Abstract

Transformer-based networks, which can effectively model the global characteristics of input data through the attention mechanism, have been widely applied to hyperspectral image (HSI) classification and have achieved promising results. However, existing networks fail to capture the complex local land-cover structures that appear at different scales and shapes in hyperspectral remote sensing images. Therefore, a novel network named the multiscale and cross-level attention learning (MCAL) network is proposed to fully exploit both the global and local multiscale features of pixels for classification. To incorporate the local spatial context of pixels into the transformer, a multiscale feature extraction (MSFE) module is constructed and embedded in the transformer-based network. Moreover, a cross-level feature fusion (CLFF) module is proposed to adaptively fuse features from the hierarchical structure of MSFEs using the attention mechanism. Finally, a spectral attention module (SAM) is placed before the hierarchical structure of MSFEs, so that both the spatial context and the spectral information are jointly emphasized for hyperspectral classification. Experiments on several benchmark datasets demonstrate that the proposed MCAL clearly outperforms both convolutional neural network (CNN)-based and transformer-based state-of-the-art networks for hyperspectral classification.
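To make the three named modules concrete, the following is a minimal PyTorch sketch of plausible forms of SAM, MSFE, and CLFF as the abstract describes them. All layer sizes, kernel choices, and the specific attention formulations are illustrative assumptions, not the authors' implementation.

```python
# Hypothetical sketch of the SAM -> MSFE hierarchy -> CLFF pipeline.
# Module internals are assumptions consistent with the abstract, not the paper's code.
import torch
import torch.nn as nn


class SpectralAttention(nn.Module):
    """Assumed SAM: squeeze-and-excitation-style reweighting of spectral bands."""
    def __init__(self, bands: int, reduction: int = 4):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(bands, bands // reduction), nn.ReLU(),
            nn.Linear(bands // reduction, bands), nn.Sigmoid(),
        )

    def forward(self, x):                         # x: (B, bands, H, W)
        w = self.fc(x.mean(dim=(2, 3)))           # per-band weights from global pooling
        return x * w[:, :, None, None]            # emphasize informative bands


class MultiscaleFeatureExtraction(nn.Module):
    """Assumed MSFE: parallel convolutions with different receptive fields."""
    def __init__(self, in_ch: int, out_ch: int):
        super().__init__()
        self.branches = nn.ModuleList(
            nn.Conv2d(in_ch, out_ch, k, padding=k // 2) for k in (1, 3, 5)
        )

    def forward(self, x):
        # Sum the scale-specific responses to capture local structures of varied size.
        return torch.relu(sum(b(x) for b in self.branches))


class CrossLevelFusion(nn.Module):
    """Assumed CLFF: attention-weighted fusion of features from different MSFE levels."""
    def __init__(self, ch: int, levels: int):
        super().__init__()
        self.scores = nn.ModuleList(nn.Conv2d(ch, 1, 1) for _ in range(levels))

    def forward(self, feats):                     # feats: list of (B, ch, H, W)
        scores = torch.cat([s(f) for s, f in zip(self.scores, feats)], dim=1)
        attn = torch.softmax(scores, dim=1)       # per-pixel weights across levels
        return sum(attn[:, i:i + 1] * f for i, f in enumerate(feats))


# Usage on a Pavia-like patch (103 bands, 9x9 neighborhood; sizes are illustrative).
x = torch.randn(2, 103, 9, 9)
sam = SpectralAttention(103)
msfe1 = MultiscaleFeatureExtraction(103, 64)
msfe2 = MultiscaleFeatureExtraction(64, 64)
f1 = msfe1(sam(x))                                # level-1 features
f2 = msfe2(f1)                                    # level-2 features
fused = CrossLevelFusion(64, levels=2)([f1, f2])  # (2, 64, 9, 9), fed to the transformer
```

In this sketch the fused map would then be tokenized and passed to the transformer encoder for classification; the per-pixel softmax in CrossLevelFusion is one simple way to realize the "adaptive" cross-level weighting the abstract mentions.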
