Abstract

Convolutional neural networks (CNNs) have attained remarkable performance in hyperspectral image (HSI) classification owing to their excellent local modeling ability. However, existing CNNs struggle to capture global context information from HSIs. Recently, the vision transformer (ViT) has proven effective in the image field; nevertheless, its retrieval of local spatial information in HSI classification is unsatisfactory, and its input mode often leads to the loss of spatial location information and local detail. In this letter, we propose a novel convolution transformer fusion splicing network (CTFSN) for HSI classification. To exploit both local and global information, this method adopts two feature-fusion strategies, addition and channel stacking, to capture hyperspectral features. First, to effectively utilize shallow features and preserve spatial location information, we propose a residual splicing convolution block to serialize the HSI. In addition, a convolutional transformer fusion block (CTFB) is designed to perform additional local modeling while capturing global features. Finally, a dual-branch fusion splicing module fuses and splices the local features from the depthwise residual block with the global features from the CTFB. Experimental results on three widely used datasets show that our method outperforms several state-of-the-art classification methods.
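The two fusion strategies named in the abstract, element-wise addition and channel stacking (concatenation), can be sketched as follows. This is an illustrative toy example, not the paper's implementation: feature maps are represented as nested lists indexed `[channel][position]`, and the function names are hypothetical.

```python
# Illustrative sketch (assumed shapes, not the paper's code): the two
# feature-fusion modes used to combine the local branch (depthwise
# residual block) and the global branch (CTFB) in CTFSN.

def fuse_by_addition(local_feats, global_feats):
    """Element-wise addition: both branches must have the same
    number of channels; the channel count is unchanged."""
    assert len(local_feats) == len(global_feats)
    return [
        [a + b for a, b in zip(lc, gc)]
        for lc, gc in zip(local_feats, global_feats)
    ]

def fuse_by_channel_stacking(local_feats, global_feats):
    """Channel stacking (concatenation): channels are appended,
    so the fused output has C_local + C_global channels."""
    return local_feats + global_feats

# Toy 2-channel feature maps from each branch.
local_branch = [[1, 2], [3, 4]]       # local features (illustrative)
global_branch = [[10, 20], [30, 40]]  # global features (illustrative)

added = fuse_by_addition(local_branch, global_branch)            # 2 channels
stacked = fuse_by_channel_stacking(local_branch, global_branch)  # 4 channels
```

Addition keeps the channel dimension fixed and blends the two branches position by position, while channel stacking preserves both branches intact and defers their mixing to a subsequent layer.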
