Abstract

Hyperspectral image (HSI) classification, which exploits both spatial and spectral information, is a crucial topic in earth observation and land cover analysis. However, ground objects with similar spectral attributes remain a challenge for finer classification. Recently, deep learning-based multimodal fusion has provided promising solutions by combining the geometric information in LiDAR data with spectral attributes. However, labor-intensive and time-consuming multimodal data annotation limits the performance of supervised deep learning methods, and it remains challenging to bridge the semantic disparity between LiDAR data and HSIs while learning transferable representations for cross-scene classification. In this paper, we propose a multimodal fusion relational network with meta-learning (MFRN-ML) to address these challenges. Specifically, MFRN-ML incorporates multimodal learning and few-shot learning (FSL) into a three-stage task-based learning framework to learn transferable cross-modality representations for few-shot HSI and LiDAR collaborative classification. First, a multimodal fusion relational network, composed of a cross-modality feature fusion module and a relation learning module, addresses the challenge of limited annotations in multimodal learning in a data-adaptive way. Then, the three-stage task-based learning framework trains the network to learn transferable representations from few labeled samples for cross-scene classification. We perform experiments on four multimodal datasets collected by different sensors. Compared with existing supervised, semi-supervised, and meta-learning methods, MFRN-ML attains state-of-the-art performance on few-shot tasks. In particular, our method shows promising generalization to unseen categories across different domains.
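To make the described pipeline concrete, the sketch below illustrates how a relation-network-style few-shot classifier over fused HSI and LiDAR features could be organized. It is a minimal PyTorch sketch under stated assumptions: the module names (CrossModalityFusion, RelationModule), feature dimensions, the concatenation-based fusion, and the prototype averaging are illustrative choices, not the paper's actual MFRN-ML architecture or three-stage training procedure.

```python
# Hypothetical sketch of relation-based few-shot classification on fused
# HSI + LiDAR features. All names, dimensions, and the fusion strategy
# (simple concatenation) are assumptions for illustration only.
import torch
import torch.nn as nn


class CrossModalityFusion(nn.Module):
    """Fuse per-pixel HSI spectral features with LiDAR elevation features."""

    def __init__(self, hsi_dim: int, lidar_dim: int, out_dim: int = 64):
        super().__init__()
        self.hsi_encoder = nn.Sequential(nn.Linear(hsi_dim, out_dim), nn.ReLU())
        self.lidar_encoder = nn.Sequential(nn.Linear(lidar_dim, out_dim), nn.ReLU())
        self.fuse = nn.Sequential(nn.Linear(2 * out_dim, out_dim), nn.ReLU())

    def forward(self, hsi: torch.Tensor, lidar: torch.Tensor) -> torch.Tensor:
        # Encode each modality separately, then fuse by concatenation + MLP.
        fused = torch.cat([self.hsi_encoder(hsi), self.lidar_encoder(lidar)], dim=-1)
        return self.fuse(fused)


class RelationModule(nn.Module):
    """Score the similarity between fused support prototypes and query embeddings."""

    def __init__(self, feat_dim: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(2 * feat_dim, feat_dim), nn.ReLU(),
            nn.Linear(feat_dim, 1), nn.Sigmoid(),
        )

    def forward(self, support: torch.Tensor, query: torch.Tensor) -> torch.Tensor:
        # support: (n_way, feat_dim) class prototypes; query: (n_query, feat_dim).
        n_way, n_query = support.size(0), query.size(0)
        pairs = torch.cat(
            [support.unsqueeze(0).expand(n_query, -1, -1),
             query.unsqueeze(1).expand(-1, n_way, -1)], dim=-1)
        return self.net(pairs).squeeze(-1)  # (n_query, n_way) relation scores


if __name__ == "__main__":
    # One episodic (N-way, K-shot) step with random data for illustration.
    n_way, k_shot, n_query = 5, 3, 10
    hsi_dim, lidar_dim = 144, 1  # assumed per-pixel feature sizes
    fusion, relation = CrossModalityFusion(hsi_dim, lidar_dim), RelationModule()

    support_hsi = torch.randn(n_way * k_shot, hsi_dim)
    support_lidar = torch.randn(n_way * k_shot, lidar_dim)
    query_hsi = torch.randn(n_query, hsi_dim)
    query_lidar = torch.randn(n_query, lidar_dim)

    # Class prototypes: mean of the fused support embeddings per class.
    support_feat = fusion(support_hsi, support_lidar).view(n_way, k_shot, -1).mean(dim=1)
    query_feat = fusion(query_hsi, query_lidar)

    scores = relation(support_feat, query_feat)  # (n_query, n_way)
    predictions = scores.argmax(dim=-1)          # predicted class index per query
    print(predictions)
```

In an episodic few-shot setup such as this, the relation scores would be trained against the true query labels over many sampled tasks, so the fusion and relation modules learn representations that transfer to episodes drawn from unseen categories or scenes.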
