85 / 2024-08-15 10:50:03
A Fast and Accurate End-to-End Method for Remote Sensing Visual Grounding (AITC 2024 + Abstract)
Remote Sensing Visual Grounding, Fast and Accurate End-to-end RSVG Method, Multi-scale Multimodal Feature Fusion Mechanism.
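Abstract pending review
李重阳 / Aerospace Information Research Institute, Chinese Academy of Sciences
张文凯 / Aerospace Information Research Institute, Chinese Academy of Sciences
李新明 / Aerospace Information University (空天信息大学)
王宏琦 / Aerospace Information Research Institute, Chinese Academy of Sciences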
Remote sensing visual grounding (RSVG) holds great promise in the field of remote sensing (RS) and has been studied extensively in recent years. Most existing methods are based on the transformer architecture and use multi-head self-attention in multimodal encoders to integrate visual and textual features. However, the repeated self-attention operations in multilayer multimodal encoders have time complexity quadratic in the number of tokens, which leads to heavy computation and slow inference. Furthermore, fusing features with multimodal encoders alone struggles to represent small-scale targets in the feature tokens, which leads to poor localization of small targets. To address these issues, we propose a fast and accurate end-to-end RSVG method (FAEM). FAEM replaces the multilayer multimodal encoders with a novel multi-scale multimodal feature fusion mechanism (MMFFM), enabling faster and more accurate multimodal inference. MMFFM broadcasts textual features across the multi-scale visual feature maps to capture the regions in those maps that are highly relevant to the query text. Additionally, we refine the loss function commonly used in RSVG methods to address its negative values and overly simple constraint conditions. Experimental results demonstrate the effectiveness of our approach: FAEM achieves the fastest inference speed while maintaining accuracy comparable to current state-of-the-art methods.
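The idea of broadcasting a text feature across multi-scale visual feature maps can be illustrated with a small sketch. This is not the authors' MMFFM, whose details the abstract does not give. It assumes one pooled text vector per query, feature maps stored as nested H x W x C lists, and cosine similarity as the relevance score. The names `cosine`, `relevance_map`, and `best_location` are hypothetical, and all values are made up. The sketch compares the text vector once with every spatial location, so the work grows linearly with the number of locations, rather than quadratically as in token-to-token self-attention.

```python
import math

def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def relevance_map(feature_map, text_vec):
    """Compare one text vector against every location of an H x W x C map."""
    return [[cosine(cell, text_vec) for cell in row] for row in feature_map]

def best_location(pyramid, text_vec):
    """Return (scale_index, row, col, score) of the most text-relevant location."""
    best = None
    for s, fmap in enumerate(pyramid):
        rel = relevance_map(fmap, text_vec)
        for r, row in enumerate(rel):
            for c, score in enumerate(row):
                if best is None or score > best[3]:
                    best = (s, r, c, score)
    return best

# Toy two-scale pyramid with 2-channel features (assumed values, for illustration only).
text_vec = [1.0, 0.0]
coarse = [[[0.0, 1.0]]]                  # 1 x 1 map
fine = [[[0.0, 1.0], [0.2, 1.0]],
        [[3.0, 0.1], [0.0, 2.0]]]        # 2 x 2 map
scale, row, col, score = best_location([coarse, fine], text_vec)
print(scale, row, col, round(score, 3))
```

In this toy input the highest score falls on the finer map, which is where a small target would be expected to remain visible. A real model would use learned projections and a trained localization head in place of the plain argmax.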
Important dates
  • Conference dates: September 20 to 22, 2024
  • Initial manuscript submission deadline: August 30, 2024
  • Registration deadline: September 22, 2024

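Hosts
People's Government of Shandong Province
Chinese Institute of Electronics
Organizers
Academic Divisions of the Chinese Academy of Sciences
Aerospace Information Research Institute, Chinese Academy of Sciences
Fudan University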