Abstract
Deep learning has revolutionized remote sensing image processing over the past few years. Nevertheless, annotating high-quality samples is difficult and time-consuming, and the resulting lack of supervision limits the performance of deep neural networks. To address this problem, we investigate the multimodal self-supervised learning (MultiSSL) paradigm for pre-training and classification of remote sensing images. Specifically, the proposed self-supervised feature learning model consists of an asymmetric encoder–decoder structure, in which a deep unified encoder learns high-level key information characterizing multimodal remote sensing data and task-specific lightweight decoders reconstruct the original data. To further enhance feature extraction capability, cross-attention layers exchange information between the heterogeneous features, so that the model learns more complementary information from multimodal remote sensing data. In the fine-tuning stage, the pre-trained encoder and cross-attention layers serve as the feature extractor, and the learned features are combined with the corresponding spectral information for land cover classification through a lightweight classifier. The self-supervised pre-training model can learn high-level key features from unlabeled samples, thereby exploiting the feature extraction capability of deep neural networks while reducing their dependence on annotated samples. Compared with existing classification paradigms, the proposed multimodal self-supervised pre-training and fine-tuning scheme achieves superior performance for remote sensing image land cover classification.
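The cross-attention exchange described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes single-head scaled dot-product attention with no learned projections, a residual connection, and two generic modalities labeled A and B, and all function and variable names are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, context):
    """Single-head scaled dot-product attention, no learned projections.

    Each query token (from one modality) is mapped to a weighted average
    of the context tokens (from the other modality).
    """
    dim = len(queries[0])
    scale = 1.0 / math.sqrt(dim)
    attended = []
    for q in queries:
        scores = [scale * sum(qi * ci for qi, ci in zip(q, c)) for c in context]
        weights = softmax(scores)
        attended.append(
            [sum(w * c[j] for w, c in zip(weights, context)) for j in range(dim)]
        )
    return attended


def exchange(tokens_a, tokens_b):
    """Bidirectional exchange; the residual sum is an assumption of this sketch."""
    a_from_b = cross_attention(tokens_a, tokens_b)
    b_from_a = cross_attention(tokens_b, tokens_a)
    fused_a = [[x + y for x, y in zip(t, u)] for t, u in zip(tokens_a, a_from_b)]
    fused_b = [[x + y for x, y in zip(t, u)] for t, u in zip(tokens_b, b_from_a)]
    return fused_a, fused_b


# Toy tokens: 2 tokens from modality A, 3 from modality B, feature size 4.
tokens_a = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0]]
tokens_b = [[0.5, 0.5, 0.0, 0.0], [0.0, 0.0, 1.0, 0.0], [0.0, 0.0, 0.0, 1.0]]
fused_a, fused_b = exchange(tokens_a, tokens_b)
```

Each modality keeps its own token count after the exchange, so the fused features can still be passed to modality-specific decoders or to a classifier. A practical model would add learned query, key, and value projections and multiple heads.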