Abstract

Deep learning models have shown great potential in remote sensing image processing and analysis. However, labeled samples are often too scarce to train deep networks, which seriously limits the performance of these models. To address this limitation, we propose a generative self-supervised feature learning (S2FL) architecture for land cover classification of multimodal remote sensing images. Specifically, multiple complementary observed views are constructed from the multimodal images and then used for generative self-supervised learning. The proposed S2FL architecture extracts high-level, meaningful feature representations from the multiview data without requiring any labeled information, offering a feasible way to relieve the urgent need for annotated samples. The learned features are normalized and merged with the corresponding spectral information to further improve their discriminative capability, and the fused features are used for land cover classification. Compared with existing supervised, semi-supervised, and self-supervised approaches, the proposed generative self-supervised model achieves superior feature learning and land cover classification performance, especially in the small-sample classification case.
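To make the described pipeline concrete, below is a minimal sketch of the workflow, assuming a simple per-view autoencoder as the generative self-supervised learner. All names (ViewAutoencoder, learn_view_features), the network sizes, and the toy data dimensions are illustrative assumptions, not the authors' implementation: the key points it shows are label-free training driven only by a reconstruction loss, feature normalization, and fusion with spectral information before classification.

```python
# Sketch of the S2FL pipeline: per-view generative self-supervised learning,
# feature normalization, and fusion with spectral information. The autoencoder
# design and all dimensions are assumptions for illustration only.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ViewAutoencoder(nn.Module):
    """One generative self-supervised learner per observed view:
    reconstruct the input, keep the bottleneck as the learned feature."""
    def __init__(self, in_dim: int, feat_dim: int = 32):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(in_dim, 128), nn.ReLU(),
                                     nn.Linear(128, feat_dim))
        self.decoder = nn.Sequential(nn.Linear(feat_dim, 128), nn.ReLU(),
                                     nn.Linear(128, in_dim))

    def forward(self, x):
        z = self.encoder(x)
        return z, self.decoder(z)

def learn_view_features(view: torch.Tensor, epochs: int = 100) -> torch.Tensor:
    """Label-free training: the reconstruction loss is the only supervision."""
    model = ViewAutoencoder(view.shape[1])
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    for _ in range(epochs):
        z, recon = model(view)
        loss = F.mse_loss(recon, view)
        opt.zero_grad()
        loss.backward()
        opt.step()
    with torch.no_grad():
        z, _ = model(view)
    return F.normalize(z, dim=1)  # normalize the learned features

# Toy multiview data standing in for multimodal remote sensing pixels,
# e.g. view 1 = spectral bands, view 2 = LiDAR-derived channels.
n_pixels = 1000
views = [torch.randn(n_pixels, 50), torch.randn(n_pixels, 8)]
spectral_info = views[0]  # raw spectra reused in the fusion step

# Self-supervised stage: one learner per complementary view, no labels.
features = [learn_view_features(v) for v in views]

# Fusion stage: concatenate normalized features with spectral information;
# the fused vectors then feed any supervised land cover classifier.
fused = torch.cat(features + [spectral_info], dim=1)
print(fused.shape)  # (1000, 32 + 32 + 50) = (1000, 114)
```

In this sketch the fused vectors would be passed to an off-the-shelf classifier (e.g., an SVM or a linear head) trained on however few labeled pixels are available, which is where the small-sample advantage of the label-free feature learning stage would show up.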
