Scale-Semantic Joint Decoupling Network for Image-Text Retrieval in Remote Sensing

Chengyu Zheng, Ning Song, Zhiqiang Wei, Jie Nie, Lei Huang, Ruoyu Zhang

doi:10.1145/3603628

Abstract

Image-text retrieval in remote sensing aims to provide flexible information for data analysis and applications. In recent years, state-of-the-art methods have adopted “scale decoupling” and “semantic decoupling” strategies to further enhance representation capability. However, these approaches disentangle either scale or semantics and do not combine the two ideas in a unified model, which severely limits the performance of cross-modal retrieval models. To address this issue, we propose a novel Scale-Semantic Joint Decoupling Network (SSJDN) for remote sensing image-text retrieval. Specifically, we design the Bidirectional Scale Decoupling (BSD) module, which uses Salience Extraction Map (SEM) and Salience Suppression Map (SSM) units to adaptively extract potential features and suppress cumbersome features at other scales in a bidirectional pattern, yielding clues at different scales. In addition, we design the Label-supervised Semantic Decoupling (LSD) module, which leverages category semantic labels as prior knowledge to supervise images and texts in probing significant semantic-related information. Finally, we design a Semantic-guided Triple Loss (STL), which adaptively generates a constant to adjust the loss function, improving the probability of matching images and texts with the same semantics and shortening the convergence time of the retrieval model. Our proposed SSJDN outperforms state-of-the-art approaches in numerical experiments conducted on four benchmark remote sensing datasets.
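The abstract does not give the STL formula, so the following is only a rough illustration of the general idea of a label-aware triplet loss, not the authors' method. It sketches a bidirectional hinge-based triplet loss in plain Python in which a constant rescales the penalty for negatives that share the anchor's category label. The function names, the `margin` value, and the `same_class_scale` constant are all hypothetical choices made for this sketch.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def label_aware_triplet_loss(img_emb, txt_emb, labels,
                             margin=0.2, same_class_scale=0.5):
    """Bidirectional triplet loss over a batch of matched pairs.

    img_emb[i] and txt_emb[i] form the positive pair i, and labels[i]
    is its category label. ASSUMPTION: negatives that share the
    anchor's label are penalised less (scaled by same_class_scale),
    so same-category images and texts are not pushed apart as hard.
    The abstract only says that a constant adjusts the loss; this
    particular form is a guess made for illustration.
    """
    n = len(img_emb)
    sim = [[cosine(img_emb[i], txt_emb[j]) for j in range(n)]
           for i in range(n)]
    total = 0.0
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            scale = same_class_scale if labels[i] == labels[j] else 1.0
            # Image i as anchor, text j as negative.
            total += scale * max(0.0, margin - sim[i][i] + sim[i][j])
            # Text i as anchor, image j as negative.
            total += scale * max(0.0, margin - sim[i][i] + sim[j][i])
    return total


if __name__ == "__main__":
    images = [[1.0, 0.0], [0.9, 0.1]]
    texts = [[1.0, 0.0], [0.9, 0.1]]
    # Same embeddings, two labelings: shared category vs. distinct categories.
    print(label_aware_triplet_loss(images, texts, labels=[0, 0]))
    print(label_aware_triplet_loss(images, texts, labels=[0, 1]))
```

In this sketch the two calls differ only in the labels, so any difference in the printed losses comes from the label-dependent constant alone.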

Journal: ACM Transactions on Multimedia Computing, Communications, and Applications
Publication Date: Aug 24, 2023
Citations: 4

