Abstract

Advancements in Earth observation technologies have greatly enhanced the potential of integrating hyperspectral image (HSI) and Light Detection and Ranging (LiDAR) data for land use and land cover classification. Despite this, most existing methods focus on employing deep network layers to extract features from the two heterogeneous data modalities, overlooking a gradual, shallow-to-deep modeling of the data representations. Furthermore, excessive network layers can cause modality-specific features to deteriorate, thereby lowering classification performance. This paper proposes a novel cross-modal feature aggregation and enhancement network for the joint classification of HSI and LiDAR data. First, a cross-modal feature fusion module is developed that exploits spatial scale consistency to interchange and fuse feature embeddings at the pixel level, largely preserving the original information of the two heterogeneous modalities. Two straightforward strategies (i.e., addition and concatenation) are then employed in the shallow network layers before the features are passed to the transformer encoder. The former helps the model discern subtler distinctions and refine spatial location details; the latter preserves information integrity, effectively mitigating the risk of feature loss. Moreover, invertible neural networks and a feature enhancement module are introduced, leveraging the complementary information of HSI and LiDAR data to enhance the detail and texture information extracted in deeper layers. Extensive experiments on the Houston2013, Trento, and MUUFL datasets demonstrate that the proposed method outperforms several state-of-the-art models on three evaluation metrics, achieving an accuracy improvement of up to 2%. The proposed model offers new insights for HSI and LiDAR classification, which is critical for accurate environmental monitoring, urban planning, and precision agriculture. The source code is publicly accessible at https://github.com/zhangyiyan001/CMFAEN.
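
To make the shallow-layer fusion described in the abstract concrete, the sketch below shows one way pixel-level HSI and LiDAR embeddings could be combined via the two named strategies (element-wise addition and channel concatenation) before a transformer encoder. This is a minimal illustrative assumption, not the authors' released implementation (available at the repository linked above); all module names, channel counts, and hyperparameters are hypothetical.

```python
# Illustrative sketch of shallow cross-modal fusion (addition + concatenation)
# feeding a transformer encoder. Not the official CMFAEN code; dimensions are
# assumed for demonstration only.
import torch
import torch.nn as nn


class ShallowCrossModalFusion(nn.Module):
    def __init__(self, hsi_channels=144, lidar_channels=1, embed_dim=64):
        super().__init__()
        # Project each modality to a common embedding size at the pixel level.
        self.hsi_embed = nn.Conv2d(hsi_channels, embed_dim, kernel_size=1)
        self.lidar_embed = nn.Conv2d(lidar_channels, embed_dim, kernel_size=1)
        # Reduce the concatenated features back to the embedding size.
        self.concat_proj = nn.Conv2d(2 * embed_dim, embed_dim, kernel_size=1)
        # A small transformer encoder consumes the fused token sequence.
        encoder_layer = nn.TransformerEncoderLayer(
            d_model=embed_dim, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(encoder_layer, num_layers=2)

    def forward(self, hsi, lidar):
        h = self.hsi_embed(hsi)      # (B, D, H, W)
        l = self.lidar_embed(lidar)  # (B, D, H, W)
        added = h + l                                                # addition strategy
        concatenated = self.concat_proj(torch.cat([h, l], dim=1))   # concatenation strategy
        fused = added + concatenated
        # Flatten spatial positions into tokens for the transformer encoder.
        tokens = fused.flatten(2).transpose(1, 2)                    # (B, H*W, D)
        return self.encoder(tokens)


if __name__ == "__main__":
    hsi = torch.randn(2, 144, 11, 11)    # e.g., an 11x11 HSI patch with 144 bands
    lidar = torch.randn(2, 1, 11, 11)    # matching single-band LiDAR elevation patch
    out = ShallowCrossModalFusion()(hsi, lidar)
    print(out.shape)                     # torch.Size([2, 121, 64])
```

In this sketch the addition branch blends the two modalities position by position, while the concatenation branch retains both feature sets before projection, loosely mirroring the complementary roles the abstract attributes to the two strategies.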
