Abstract

Deep learning (DL) has become a mainstream approach to hyperspectral image (HSI) classification, and many DL-based methods exploit spatial-spectral features to achieve better classification results. However, owing to the complex backgrounds in HSIs, existing methods often perform poorly on pixels located in land-cover boundary regions. This is largely because, during training, the network is susceptible to interference from irrelevant information surrounding the target pixel, which leads to inaccurate feature extraction. In this paper, we propose a new multibranch transformer architecture (SST-M) that combines spatial attention with spectral feature extraction to address this problem. The transformer has a global receptive field and can therefore integrate global spatial position information across the HSI cube. In addition, we design a spatial sequence attention module that enhances useful spatial location features and suppresses irrelevant information. Because HSIs contain rich spectral information, a spectral feature extraction module is designed to extract discriminative spectral features; it replaces the widely used PCA step and yields better classification results. Finally, inspired by semantic segmentation, a mask prediction module classifies all pixels in the HSI cube, guiding the network to learn precise pixel characteristics and spatial distributions. To verify the effectiveness of SST-M, quantitative experiments were conducted on three well-known datasets, namely IP, PU, and KSC. The experimental results demonstrate that the proposed model outperforms other state-of-the-art methods.
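The abstract describes the architecture only at a high level, so the following is a minimal, hypothetical PyTorch sketch of the kind of pipeline it outlines: a spectral feature extraction branch standing in for PCA, a gated form of spatial sequence attention, a transformer encoder over the pixel tokens of an HSI patch, and a per-pixel mask prediction head. The class names, the gating formulation, and all layer sizes below are assumptions made for illustration, not the authors' implementation.

```python
import torch
import torch.nn as nn


class SpectralFeatureExtractor(nn.Module):
    """Per-pixel 1-D convolution over the spectral axis (illustrative stand-in for PCA)."""

    def __init__(self, bands: int, dim: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv1d(1, dim, kernel_size=7, padding=3),
            nn.GELU(),
            nn.AdaptiveAvgPool1d(1),  # collapse the spectral axis to one feature vector per pixel
        )

    def forward(self, cube: torch.Tensor) -> torch.Tensor:   # cube: (B, bands, H, W)
        b, c, h, w = cube.shape
        x = cube.permute(0, 2, 3, 1).reshape(b * h * w, 1, c)
        return self.net(x).reshape(b, h, w, -1)              # (B, H, W, dim)


class SSTMSketch(nn.Module):
    """Spectral branch -> spatially attended transformer -> per-pixel mask head (assumed layout)."""

    def __init__(self, bands: int, n_classes: int, dim: int = 64, depth: int = 2):
        super().__init__()
        self.spectral = SpectralFeatureExtractor(bands, dim)
        # Assumed form of the spatial sequence attention: a learned per-pixel gate.
        self.spatial_gate = nn.Sequential(nn.Linear(dim, 1), nn.Sigmoid())
        layer = nn.TransformerEncoderLayer(d_model=dim, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=depth)
        self.mask_head = nn.Linear(dim, n_classes)  # classifies every pixel in the cube

    def forward(self, cube: torch.Tensor) -> torch.Tensor:   # cube: (B, bands, H, W)
        b, _, h, w = cube.shape
        tokens = self.spectral(cube).reshape(b, h * w, -1)   # one token per pixel
        tokens = tokens * self.spatial_gate(tokens)          # emphasise useful spatial locations
        tokens = self.encoder(tokens)                        # global receptive field over the patch
        return self.mask_head(tokens).reshape(b, h, w, -1)   # (B, H, W, n_classes)


if __name__ == "__main__":
    model = SSTMSketch(bands=200, n_classes=16)               # e.g. an Indian-Pines-sized input
    logits = model(torch.randn(2, 200, 9, 9))                 # batch of two 9x9 spatial patches
    print(logits.shape)                                       # torch.Size([2, 9, 9, 16])
```

The mask head returns a class score for every pixel in the input patch rather than only the centre pixel, which mirrors the segmentation-style supervision the abstract attributes to the mask prediction module.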
