Hyperspectral Image Classification Based on Transposed Convolutional Neural Network Transformer

Baisen Liu, Zongting Jia, Weili Kong, Penggang Guo

doi:10.3390/electronics12183879


Open Access

https://doi.org/10.3390/electronics12183879


Abstract

Hyperspectral imaging captures images of objects across a wide spectral range, and the additional spectral information reveals subtle variations and compositional components in the objects. Convolutional neural networks (CNNs) have shown remarkable feature extraction capabilities for hyperspectral image (HSI) classification, but their ability to capture deep semantic features is limited. Transformer models based on attention mechanisms, on the other hand, excel at handling sequential data and have demonstrated great potential in various applications. Motivated by these two facts, this paper proposes a multiscale spectral–spatial transposed transformer (MSSTT) that captures the high-level semantic features of an HSI while preserving the spectral information as much as possible. The MSSTT consists of a spectral–spatial Inception module that extracts spectral and spatial features using multiscale convolutional kernels, and a spatial transpose Inception module that further enhances and extracts spatial information. A transformer model with a cosine attention mechanism is also included to extract deep semantic features, with the Q, K, and V matrices constrained so that the output remains within the activation range. Finally, the classification results are obtained by applying a linear layer to the learnable tokens. Experimental results on three public datasets show that the proposed MSSTT outperforms other deep learning methods in HSI classification. On the Indian Pines, Pavia University, and Salinas datasets, it achieved accuracies of 97.19%, 99.47%, and 99.90%, respectively, with a training set proportion of 5%.
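To make the idea of cosine attention concrete, the following is a minimal sketch in plain Python, not the authors' implementation. It assumes the common formulation in which attention scores are cosine similarities between queries and keys, divided by a temperature before the softmax. The function names, the temperature parameter `tau`, and its default value are illustrative assumptions; the abstract does not specify how the MSSTT constrains Q, K, and V.

```python
import math

def l2_normalize(vec, eps=1e-12):
    """Scale a vector to unit length (eps guards against division by zero)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / max(norm, eps) for x in vec]

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def cosine_attention(queries, keys, values, tau=0.1):
    """Attention whose scores are cosine similarities divided by tau.

    tau is a hypothetical temperature chosen for illustration only.
    Because cosine similarity lies in [-1, 1], every pre-softmax score
    is bounded by 1 / tau regardless of the magnitude of Q and K.
    """
    q_hat = [l2_normalize(q) for q in queries]
    k_hat = [l2_normalize(k) for k in keys]
    dim = len(values[0])
    outputs = []
    for q in q_hat:
        # Dot product of unit vectors equals their cosine similarity.
        sims = [sum(a * b for a, b in zip(q, k)) for k in k_hat]
        weights = softmax([s / tau for s in sims])
        # Each output token is a convex combination of the value vectors.
        outputs.append(
            [sum(w * v[d] for w, v in zip(weights, values)) for d in range(dim)]
        )
    return outputs

# Toy example: two tokens with two-dimensional features.
queries = [[1.0, 0.0], [0.0, 2.0]]
keys = [[3.0, 0.0], [0.0, 5.0]]
values = [[1.0, 0.0], [0.0, 1.0]]
out = cosine_attention(queries, keys, values)
```

In this sketch, rescaling a query or key leaves the attention weights unchanged, which is one way bounded scores can be obtained; whether the paper uses exactly this normalization is not stated in the abstract.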

Full Text