Multi-scale spectral-spatial dual-transformer network for hyperspectral image classification

Xiuping Jia, Zhaojie Pan, Hang Fu, Genyun Sun, Aizhu Zhang, Sunjinyan Ding

doi:10.1080/01431161.2023.2203340

Abstract

Deep learning methods have shown great advantages in hyperspectral image (HSI) classification tasks. In particular, convolutional neural network (CNN)-based methods for HSI classification have made great progress. However, CNNs struggle to capture long-range spatial and spectral information. Recently, the transformer, a novel deep learning model, has demonstrated the potential to replace CNNs in various classification tasks thanks to its strong performance. In this letter, a multi-scale spectral-spatial dual-transformer network (MS3DT) is proposed to model spectral-spatial features in depth using transformers. Specifically, MS3DT consists of a feature pyramid network (FPN), a spectral transformer subnetwork (SPECT) and a spatial transformer subnetwork (SPAT). To exploit complementary multi-scale characteristics, we introduce the FPN to capture shallow-to-deep spectral-spatial features. To improve the representational capacity in the spatial and spectral domains, SPECT extracts long-range spectral correlations over local spectral features, and SPAT explores contextual information for better refinement. By merging the two subnetworks, MS3DT can adaptively recalibrate the nonlinear interdependence of shallow-to-deep spectral-spatial features. Experiments on two widely used HSI datasets show that MS3DT can outperform state-of-the-art (SOTA) algorithms. The source code will be available at https://github.com/RsAI-lab/MSSSDT.
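The dual-branch idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it omits the FPN, learned query/key/value projections, multiple heads and positional encodings, and the tokenization and fusion choices are assumptions. All function names (`spatial_tokens`, `spectral_tokens`, `self_attention`, `mean_pool`, `make_patch`) are hypothetical. The sketch only shows one plausible way to view the same HSI patch as two token sequences, one per pixel for a spatial branch and one per band for a spectral branch, and to apply self-attention to each so that every token can attend to every other token regardless of distance.

```python
import math
import random


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(tokens):
    """Single-head scaled dot-product self-attention.

    Assumption: identity Q/K/V projections, so no parameters are learned.
    """
    d = len(tokens[0])
    scale = 1.0 / math.sqrt(d)
    out = []
    for q in tokens:
        scores = [scale * sum(a * b for a, b in zip(q, k)) for k in tokens]
        weights = softmax(scores)
        out.append([
            sum(weights[j] * tokens[j][c] for j in range(len(tokens)))
            for c in range(d)
        ])
    return out


def spatial_tokens(patch):
    """One token per pixel; each token is that pixel's spectrum."""
    return [list(pixel) for row in patch for pixel in row]


def spectral_tokens(patch):
    """One token per band; each token holds that band's values over the patch."""
    pixels = spatial_tokens(patch)
    bands = len(pixels[0])
    return [[p[b] for p in pixels] for b in range(bands)]


def mean_pool(tokens):
    """Average a token sequence into a single feature vector."""
    n = len(tokens)
    return [sum(t[c] for t in tokens) / n for c in range(len(tokens[0]))]


def make_patch(height, width, bands, seed=0):
    """Synthetic HSI patch of shape (height, width, bands) with random values."""
    rng = random.Random(seed)
    return [[[rng.random() for _ in range(bands)] for _ in range(width)]
            for _ in range(height)]


patch = make_patch(3, 3, 5)                        # 3x3 pixels, 5 bands
spat_out = self_attention(spatial_tokens(patch))   # 9 tokens of length 5
spec_out = self_attention(spectral_tokens(patch))  # 5 tokens of length 9

# Assumed fusion: pool each branch and concatenate the two feature vectors.
fused = mean_pool(spat_out) + mean_pool(spec_out)
print(len(spat_out), len(spec_out), len(fused))
```

In a trainable model the fused vector would feed a classifier head, and the token sequences would be built from multi-scale feature maps rather than from raw pixel values.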

Full Text