Abstract

Content-based remote sensing (RS) image retrieval (CBRSIR) is a critical way to organize high-resolution RS (HRRS) images in the current big data era. The increasing volume of HRRS images from different satellites and sensors has drawn growing attention to the cross-source CBRSIR (CS-CBRSIR) problem. Due to data drift, one crucial problem in CS-CBRSIR is the modality discrepancy. To address this issue, most existing methods focus on finding a common feature space for various HRRS images, in which similarity relations can be measured directly to obtain cross-source retrieval results. This approach is feasible and reasonable; however, the specific information carried by HRRS images from different sources is usually ignored, which limits retrieval performance. To overcome this limitation, we develop a new model for CS-CBRSIR named dual-modality collaborative learning (DMCL). To fully explore the specific information in diverse HRRS images, DMCL first introduces ResNet50 as the feature extractor. Then, a common space mutual learning module is developed to map the specific features into a common space, where the modality discrepancy is reduced at both the feature level and the distribution level. Finally, to supplement the common features with specific knowledge, we develop modality transformation and dual-modality feature learning modules, which transmit the specific knowledge between sources and fuse the specific and common features adaptively. Comprehensive experiments are conducted on a public dataset. Compared with many existing methods, our DMCL achieves stronger performance. These encouraging results indicate that the proposed DMCL is effective for CS-CBRSIR tasks.
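The abstract outlines a pipeline of per-source ResNet50 extractors, a mapping into a common space, and an adaptive fusion of specific and common features. A minimal PyTorch sketch of such a pipeline is given below; the module layout, the feature dimensions, and the sigmoid gating are our assumptions for illustration, not the paper's published implementation.

```python
# Hedged sketch of a DMCL-style pipeline; names, dimensions, and the gating
# mechanism are illustrative assumptions, not the authors' released code.
import torch
import torch.nn as nn
import torchvision.models as models


class DMCLSketch(nn.Module):
    """Per-source ResNet50 extractors, a shared common-space projection,
    and a gated fusion of specific and common features for retrieval."""

    def __init__(self, common_dim: int = 512):
        super().__init__()
        # One specific feature extractor per image source.
        self.extractor_a = self._resnet50_backbone()
        self.extractor_b = self._resnet50_backbone()
        # Shared projection that maps both specific features into a common space.
        self.to_common = nn.Linear(2048, common_dim)
        # Projection of the specific features to the fusion dimension.
        self.specific_proj = nn.Linear(2048, common_dim)
        # Gate that adaptively weights specific vs. common information.
        self.gate = nn.Sequential(
            nn.Linear(2048 + common_dim, common_dim),
            nn.Sigmoid(),
        )

    @staticmethod
    def _resnet50_backbone() -> nn.Module:
        backbone = models.resnet50(weights=None)
        backbone.fc = nn.Identity()  # keep the 2048-d pooled features
        return backbone

    def forward(self, x: torch.Tensor, source: str) -> torch.Tensor:
        extractor = self.extractor_a if source == "a" else self.extractor_b
        specific = extractor(x)             # (B, 2048) source-specific features
        common = self.to_common(specific)   # (B, common_dim) common features
        g = self.gate(torch.cat([specific, common], dim=1))
        # Fused retrieval embedding: adaptive mix of specific and common cues.
        return g * self.specific_proj(specific) + (1.0 - g) * common
```

One design choice worth noting in this sketch: keeping a separate backbone per source lets each branch specialize in its sensor's statistics, while the shared projection is the only place the two branches are forced to agree.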

Highlights

  • With the advancement in remote sensing (RS) observation technologies, the capability of capturing RS images has been enhanced dramatically

  • We find that the common feature learning and adaptive dual-modality fusion blocks make positive contributions on top of the specific feature extractor

  • This paper proposes a new model for CS-CBRSIR tasks named dual-modality collaborative learning (DMCL)


Introduction

With the advancement of remote sensing (RS) observation technologies, the capability of capturing RS images has been enhanced dramatically. Although many useful cross-modal retrieval methods have been introduced for natural images and have achieved success in different applications, we cannot apply them directly to HRRS CS-CBRSIR tasks. The mutual effect of images from different sources is not considered: those methods emphasize the importance of common space learning but ignore the modality-specific features, which are crucial to the CS-CBRSIR task. In this paper, a new HRRS CS-CBRSIR method (DMCL) is proposed based on the framework of cross-modal learning. DMCL can learn discriminative specific and common features from different types of HRRS images, which are beneficial to CS-CBRSIR tasks.
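Per the abstract, the modality gap in the common space is reduced from the aspects of both features and their distributions. The sketch below shows one plausible form of such an alignment objective, combining a feature-level MSE term on paired samples with an RBF-kernel maximum mean discrepancy (MMD) term for the distributions; the concrete losses, the kernel, and the weight `lam` are our assumptions, not the paper's exact formulation.

```python
# Hedged sketch of a common-space alignment objective (our assumption):
# a feature-level term pulls paired cross-source features together, and an
# MMD term matches the two feature distributions as a whole.
import torch
import torch.nn.functional as F


def rbf_mmd(x: torch.Tensor, y: torch.Tensor, sigma: float = 1.0) -> torch.Tensor:
    """Maximum mean discrepancy between two feature batches, RBF kernel."""
    def kernel(a: torch.Tensor, b: torch.Tensor) -> torch.Tensor:
        sq_dist = torch.cdist(a, b).pow(2)
        return torch.exp(-sq_dist / (2.0 * sigma ** 2))

    return kernel(x, x).mean() + kernel(y, y).mean() - 2.0 * kernel(x, y).mean()


def alignment_loss(common_a: torch.Tensor, common_b: torch.Tensor,
                   lam: float = 0.1) -> torch.Tensor:
    """Feature-level alignment on paired samples plus a distribution term.

    common_a / common_b: (B, D) common-space features of paired images from
    the two sources; lam weights the MMD term (assumed value).
    """
    feature_term = F.mse_loss(common_a, common_b)
    distribution_term = rbf_mmd(common_a, common_b)
    return feature_term + lam * distribution_term
```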

Unified-Source Content-Based Remote Sensing Image Retrieval (US-CBRSIR)
Cross-Modal Retrieval in Remote Sensing
The Overview of the Framework
Specific Feature Extractor
Common Feature Learning
Adaptive Dual-Modality Fusion
Overall Training and Inference Process
Experiment Setup
Reasonableness of Backbone
Comparison with Diverse Methods
Ablation Study
Sensitivity Study
Conclusions