Abstract

Hyperspectral image (HSI) classification tasks have been adopted in huge applications of remote sensing recently. With the rise of deep learning development, it becomes crucial to investigate how to exploit spatial–spectral features. The traditional approach is to stack models that can encode spatial–spectral features, coupling sufficient information as much as possible, before the classification model. However, this sequential stacking tends to cause information redundancy. In this paper, a novel network utilizing the channel attention combined discrete cosine transform (DCTransformer) to extract spatial–spectral features has been proposed to address this issue. It consists of a detail spatial feature extractor (DFE) with CNN blocks and a base spectral feature extractor (BFE) utilizing the channel attention mechanism (CAM) with a discrete cosine transform (DCT). Firstly, the DFE can extract detailed context information using a series of layers of a CNN. Further, the BFE captures spectral features using channel attention and stores the wider frequency information by utilizing the DCT. Ultimately, the dynamic fusion mechanism has been adopted to fuse the detail and base features. Comprehensive experiments show that the DCTransformer achieves a state-of-the-art (SOTA) performance in the HSI classification task, compared to other methods on four datasets, the University of Houston (UH), Indian Pines (IP), MUUFL, and Trento datasets. On the UH dataset, the DCTransformer achieves an OA of 94.40%, AA of 94.89%, and kappa of 93.92.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call