Abstract

Remote sensing (RS) crossmodal text-image retrieval has become a research hotspot in recent years for its application in semantic localization. However, since multiple inferences on slices are demanded in semantic localization, designing a crossmodal retrieval model with less computation but well performance becomes an emergent and challenging task. In this article, considering the characteristics of multi-scale and target redundancy in RS, a concise but effective crossmodal retrieval model (LW-MCR) is designed. The proposed model incorporates multi-scale information and dynamically filters out redundant features when encoding RS image, while text features are obtained via lightweight group convolution. To improve the retrieval performance of LW-MCR, we come up with a novel hidden supervised optimization method based on knowledge distillation. This method enables the proposed model to acquire dark knowledge of the multi-level layers and representation layers in the teacher network, which significantly improves the accuracy of our lightweight model. Finally, on the basis of contrast learning, we present a method employing unlabeled data to boost the performance of RS retrieval model further. The experiment results on four RS image-text datasets demonstrate the efficiency of LW-MCR in RS crossmodal retrieval (RSCR) tasks. We have released some codes of the semantic localization and made it open to access at <uri>https://github.com/xiaoyuan1996/retrievalSystem</uri>.

Full Text
Paper version not known

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.