Abstract

With the development of high-resolution satellites, remote sensing (RS) scene classification has attracted increasing attention. Convolutional neural networks (CNNs), which replace traditional handcrafted features with a learning-based feature extraction mechanism, are widely used in scene classification. However, CNNs are less effective at capturing long-range contextual relations, which limits further improvement. The visual transformer (VT), an emerging image processing architecture, offers a new perspective for RS scene classification by directly capturing long-range features. Although a few works have combined CNNs and VTs through simple concatenation, the collaboration between the two remains insufficient. To address these issues, we propose a local and long-range collaborative framework (L2RCF). First, we design a dual-stream structure to extract local and long-range features. Second, a cross-feature calibration (CFC) module is designed to improve the representation of the fused features. Third, a novel joint loss combining deep supervision (DS) and deep mutual learning (DML) is proposed to enhance the dual-stream feature extractor and further improve the fused features. Finally, a two-stage semi-supervised training strategy is designed to improve performance by exploiting unlabeled samples. To demonstrate the effectiveness of L2RCF, we conducted experiments on three widely used RS scene classification data sets: RSSCN7, AID, and NWPU. The results show that L2RCF significantly outperforms several state-of-the-art scene classification methods.
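
The abstract does not give implementation details, but the dual-stream design with cross-feature calibration can be outlined roughly as follows. This is a minimal PyTorch sketch under our own assumptions: the ResNet-18 and ViT-B/16 backbones, the feature dimension, and the gating-based calibration are illustrative choices, not the authors' exact design.

```python
import torch
import torch.nn as nn
import torchvision.models as models


class CrossFeatureCalibration(nn.Module):
    """Hypothetical CFC module: each stream is re-weighted by a gate computed
    from the other stream before fusion (illustrative, not the paper's exact design)."""

    def __init__(self, dim):
        super().__init__()
        self.gate_from_long = nn.Sequential(nn.Linear(dim, dim), nn.Sigmoid())
        self.gate_from_local = nn.Sequential(nn.Linear(dim, dim), nn.Sigmoid())

    def forward(self, f_local, f_long):
        f_local_cal = f_local * self.gate_from_long(f_long)   # calibrate local with long-range cues
        f_long_cal = f_long * self.gate_from_local(f_local)   # calibrate long-range with local cues
        return torch.cat([f_local_cal, f_long_cal], dim=1)


class DualStreamClassifier(nn.Module):
    """Local (CNN) + long-range (transformer) dual-stream classifier with CFC fusion."""

    def __init__(self, num_classes, dim=512):
        super().__init__()
        self.local_stream = models.resnet18(weights=None)        # local-feature stream (assumed backbone)
        self.local_stream.fc = nn.Linear(self.local_stream.fc.in_features, dim)
        self.long_stream = models.vit_b_16(weights=None)         # long-range stream (assumed backbone)
        self.long_stream.heads = nn.Linear(self.long_stream.hidden_dim, dim)
        self.cfc = CrossFeatureCalibration(dim)
        self.head = nn.Linear(2 * dim, num_classes)              # classifier on the fused features
        self.head_local = nn.Linear(dim, num_classes)            # auxiliary head for DS/DML
        self.head_long = nn.Linear(dim, num_classes)             # auxiliary head for DS/DML

    def forward(self, x):
        f_local = self.local_stream(x)
        f_long = self.long_stream(x)
        fused = self.cfc(f_local, f_long)
        return self.head(fused), self.head_local(f_local), self.head_long(f_long)
```

With 224x224 RGB inputs, the model returns fused, local-only, and long-range-only logits. Similarly, the joint loss combining deep supervision and deep mutual learning is not specified beyond the abstract; a plausible formulation, again under our own assumptions (the loss weights and the symmetric KL term are hypothetical), would be:

```python
import torch.nn.functional as F


def joint_loss(logits_fused, logits_local, logits_long, targets,
               lambda_ds=0.5, lambda_dml=0.1):
    """Sketch of a DS + DML joint loss; the weights and exact terms are assumptions."""
    # Main supervision on the fused prediction.
    loss_main = F.cross_entropy(logits_fused, targets)
    # Deep supervision: auxiliary cross-entropy on each stream's own prediction.
    loss_ds = F.cross_entropy(logits_local, targets) + F.cross_entropy(logits_long, targets)
    # Deep mutual learning: symmetric KL divergence so the two streams mimic each other.
    log_p_local = F.log_softmax(logits_local, dim=1)
    log_p_long = F.log_softmax(logits_long, dim=1)
    loss_dml = (F.kl_div(log_p_local, log_p_long.exp(), reduction="batchmean")
                + F.kl_div(log_p_long, log_p_local.exp(), reduction="batchmean"))
    return loss_main + lambda_ds * loss_ds + lambda_dml * loss_dml
```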
