SSFAN: A Compact and Efficient Spectral-Spatial Feature Extraction and Attention-Based Neural Network for Hyperspectral Image Classification

Abstract

Hyperspectral image (HSI) classification is a crucial technique that assigns each pixel in an image to a specific land cover category by leveraging both spectral and spatial information. In recent years, HSI classification methods based on convolutional neural networks (CNNs) and Transformers have significantly improved performance due to their strong feature extraction capabilities. However, these improvements often come with increased model complexity, leading to higher computational costs. To address this, we propose a compact and efficient spectral-spatial feature extraction and attention-based neural network (SSFAN) for HSI classification. The SSFAN model consists of three core modules: the Parallel Spectral-Spatial Feature Extraction Block (PSSB), the Scan Block, and the Squeeze-and-Excitation MLP Block (SEMB). After preprocessing, the HSI data are fed into the PSSB module, which contains two parallel streams, each comprising a 3D convolutional layer and a 2D convolutional layer. The 3D convolutional layer extracts spectral and spatial features from the input hyperspectral data, while the 2D convolutional layer further enhances the spatial feature representation. Next, the Scan Block module employs a layered scanning strategy to extract spatial information at different scales from the central pixel outward, enabling the model to capture both local and global spatial relationships. The SEMB module combines the Spectral-Spatial Recurrent Block (SSRB) and the MLP Block. The SSRB, with its adaptive weight assignment mechanism in the SToken Module, flexibly handles time steps and feature dimensions, performing deep spectral and spatial feature extraction through multiple state updates. Finally, the MLP Block processes the input features through a series of linear transformations, GELU activation functions, and Dropout layers, capturing complex patterns and relationships within the data, and concludes with an argmax layer for classification.
Experimental results show that the proposed SSFAN model delivers superior classification performance, outperforming the second-best method by 1.72%, 5.19%, and 1.94% in OA, AA, and Kappa coefficient, respectively, on the Indian Pines dataset. Additionally, it requires less training and testing time compared to other state-of-the-art deep learning methods.
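As a rough illustration of the classification head the abstract describes (linear transformations, GELU, Dropout, and a final argmax), the following NumPy sketch mirrors that pipeline. All layer sizes, weights, and the tanh GELU approximation are illustrative assumptions, not the paper's actual configuration:

```python
import numpy as np

rng = np.random.default_rng(0)

def gelu(x):
    # Tanh approximation of GELU, common in many implementations.
    return 0.5 * x * (1.0 + np.tanh(np.sqrt(2.0 / np.pi) * (x + 0.044715 * x**3)))

def mlp_block(x, w1, b1, w2, b2, drop_rate=0.1, training=False):
    """Linear -> GELU -> Dropout -> Linear, ending in an argmax over classes."""
    h = gelu(x @ w1 + b1)
    if training and drop_rate > 0:
        mask = rng.random(h.shape) >= drop_rate
        h = h * mask / (1.0 - drop_rate)   # inverted dropout scaling
    logits = h @ w2 + b2
    return np.argmax(logits, axis=-1)

# Toy shapes: 4 pixels, 32-dim features, 16 classes (all sizes are illustrative).
x  = rng.standard_normal((4, 32))
w1 = rng.standard_normal((32, 64)) * 0.1
b1 = np.zeros(64)
w2 = rng.standard_normal((64, 16)) * 0.1
b2 = np.zeros(16)
labels = mlp_block(x, w1, b1, w2, b2)
```

In the real model this head sits on top of the PSSB, Scan Block, and SSRB features; here the input is random data purely to show the shapes involved.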

Similar Papers
  • Research Article
  • Cited by 30
  • DOI: 10.3390/rs12091395
A Lightweight Spectral–Spatial Feature Extraction and Fusion Network for Hyperspectral Image Classification
  • Apr 28, 2020
  • Remote Sensing
  • Linlin Chen + 2 more

Hyperspectral image (HSI) classification accuracy has been greatly improved by employing deep learning. Current research mainly focuses on how to build a deep network to improve accuracy. However, these networks tend to be more complex and have more parameters, which makes the model difficult to train and easy to overfit. Therefore, we present a lightweight deep convolutional neural network (CNN) model called S2FEF-CNN. In this model, three S2FEF blocks are used for joint spectral–spatial feature extraction. Each S2FEF block uses a 1D spectral convolution to extract spectral features and a 2D spatial convolution to extract spatial features, and then fuses the spectral and spatial features by multiplication. Instead of fully connected layers, two pooling layers follow the three blocks for dimension reduction, which further reduces the training parameters. We compared our method with some state-of-the-art deep-network-based HSI classification methods on three commonly used hyperspectral datasets. The results show that our network can achieve comparable classification accuracy with significantly fewer parameters than the above deep networks, which reflects its potential advantages in HSI classification.
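The fusion-by-multiplication idea in each S2FEF block can be sketched as follows; the kernels, "same" edge padding, and patch sizes below are toy assumptions for illustration, not the trained filters of S2FEF-CNN:

```python
import numpy as np

rng = np.random.default_rng(1)
patch = rng.standard_normal((5, 5, 8))   # 5x5 spatial window, 8 bands (illustrative)

# Spectral branch: length-3 smoothing kernel slid along the band axis, "same" padding.
padded = np.pad(patch, ((0, 0), (0, 0), (1, 1)), mode="edge")
spec_kernel = np.array([0.25, 0.5, 0.25])
spectral = np.stack(
    [np.tensordot(padded[:, :, i:i + 3], spec_kernel, axes=([2], [0]))
     for i in range(8)], axis=-1)                          # -> (5, 5, 8)

# Spatial branch: 3x3 mean filter applied per band, "same" padding.
padded2 = np.pad(patch, ((1, 1), (1, 1), (0, 0)), mode="edge")
spatial = np.zeros_like(patch)
for r in range(5):
    for c in range(5):
        spatial[r, c] = padded2[r:r + 3, c:c + 3, :].mean(axis=(0, 1))

# S2FEF-style fusion: elementwise multiplication of the two feature maps.
fused = spectral * spatial
```

Because both branches keep the input shape, the product is defined elementwise and the fused map can be passed directly to the next block.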

  • Research Article
  • Cited by 3
  • DOI: 10.1016/j.jiixd.2024.03.002
A two-branch multiscale spectral-spatial feature extraction network for hyperspectral image classification
  • Mar 9, 2024
  • Journal of Information and Intelligence
  • Aamir Ali + 4 more

In the field of hyperspectral image (HSI) classification in remote sensing, the combination of spectral and spatial features has gained considerable attention. In addition, multiscale feature extraction is very effective at improving classification accuracy for HSIs, as it can capture a large amount of intrinsic information. However, some existing methods for extracting spectral and spatial features can only generate low-level features and consider limited scales, leading to poor classification results, and dense-connection-based methods enhance feature propagation at the cost of high model complexity. This paper presents a two-branch multiscale spectral-spatial feature extraction network (TBMSSN) for HSI classification. We design multiscale spectral feature extraction (MSEFE) and multiscale spatial feature extraction (MSAFE) modules to improve the feature representation, and a spatial attention mechanism is applied in the MSAFE module to reduce redundant information and enhance the representation of spatial features at multiple scales. We then densely connect a series of MSEFE or MSAFE modules in a two-branch framework to balance efficiency and effectiveness, alleviate the vanishing-gradient problem, and strengthen feature propagation. To evaluate the effectiveness of the proposed method, experiments were carried out on benchmark HSI datasets, demonstrating that TBMSSN obtains higher classification accuracy than several state-of-the-art methods.

  • Research Article
  • Cited by 32
  • DOI: 10.3390/rs12122035
Residual Group Channel and Space Attention Network for Hyperspectral Image Classification
  • Jun 24, 2020
  • Remote Sensing
  • Peida Wu + 3 more

Recently, deep learning methods based on three-dimensional (3D) convolution have been widely used in hyperspectral image (HSI) classification tasks and have shown good classification performance. However, affected by the irregular distribution of classes in HSI datasets, most previous 3D convolutional neural network (CNN)-based models require more training samples to obtain better classification accuracies. In addition, as the network deepens, the spatial resolution of the feature maps gradually decreases, so much useful information may be lost during training. Therefore, ensuring efficient network training is key to HSI classification tasks. To address these issues, we propose a 3D-CNN-based residual group channel and space attention network (RGCSA) for HSI classification. Firstly, the proposed bottom-up top-down attention structure with residual connections improves training efficiency by optimizing channel-wise and spatial-wise features throughout the training process. Secondly, the proposed residual group channel-wise attention module reduces the possibility of losing useful information, and the novel spatial-wise attention module extracts context information to strengthen the spatial features. Furthermore, the proposed RGCSA network needs only a few training samples to achieve higher classification accuracies than previous 3D-CNN-based networks. The experimental results on three commonly used HSI datasets demonstrate the superiority of the proposed attention-based network and the effectiveness of the proposed channel-wise and spatial-wise attention modules for HSI classification. The code and configurations are released on GitHub.

  • Conference Article
  • DOI: 10.1109/icicsp55539.2022.10050698
Lightweight Multilevel Feature Fusion Network for Hyperspectral Image Classification
  • Nov 26, 2022
  • Quanyu Huang + 3 more

Hyperspectral image (HSI) classification is a key technology in remote sensing image processing. In recent years, the convolutional neural network (CNN), a powerful feature extractor, has been introduced into the field of HSI classification. Since the features of an HSI are the basis of its classification, how to effectively extract spectral-spatial features from HSIs with CNNs has become a research hotspot. HSI feature extraction networks based on two-dimensional (2D) and three-dimensional (3D) CNNs can extract both spectral and spatial information, but may incur increased parameters and computational cost. Compared with 2D and 3D CNNs, a one-dimensional (1D) CNN greatly reduces the number of parameters and the computational cost; however, 1D-CNN-based algorithms can only extract spectral information and ignore spatial information. Therefore, in this paper, a lightweight multilevel feature fusion network (LMFFN) is proposed for HSI classification, which aims to achieve efficient extraction of spectral-spatial features while minimizing the number of parameters. The main contributions of this paper are twofold. First, we design a hybrid spectral-spatial feature extraction network (HSSFEN) to combine the advantages of 1D, 2D, and 3D CNNs; it introduces depthwise separable convolutions, which effectively reduce the complexity of the proposed HSSFEN. Second, a multilevel spectral-spatial feature fusion network (MSSFFN) is proposed to obtain more effective spectral-spatial features by fusing the bottom-level and top-level spectral-spatial features. To demonstrate the performance of the proposed method, a series of experiments is conducted on three HSI datasets: Indian Pines, University of Pavia, and Salinas Scene.
The experimental results indicate that the proposed LMFFN achieves better performance than manual feature extraction methods and other deep learning methods, which demonstrates the superiority of the proposed approach.

  • Research Article
  • Cited by 7
  • DOI: 10.1080/01431161.2021.1993464
Two-Stage Attention Network for hyperspectral image classification
  • Nov 6, 2021
  • International Journal of Remote Sensing
  • Peida Wu + 3 more

Considering that a hyperspectral image (HSI) has a large number of spectral bands, many papers have introduced attention mechanisms into models based on three-dimensional (3D) convolution to optimize the features and make full use of the more informative ones. However, although the number of spectral bands is large, many bands are useless or noisy, which may feed useless features into the subsequent network and affect the learning efficiency of each convolutional layer. Therefore, how to reduce the influence of noise, both from the HSI data itself and from the classification process, is key to HSI classification tasks. In this paper, we propose a 3D convolutional neural network (3D-CNN)-based two-stage attention network (TSAN) for HSI classification. First, the spectral-wise attention module in the first stage optimizes the whole spectrum by shielding useless spectral bands and reducing noise in the spectrum. Second, more discriminative spectral-spatial features are extracted and sent to the subsequent layers by a channel-wise attention mechanism combined with soft thresholding in the second stage. In addition, we introduce a non-local block to learn global spatial features and use a multi-scale network to combine the local and global spatial information. The experiments carried out on three HSI datasets show that the proposed network can indeed reduce noise through soft thresholding and achieves promising classification performance.
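The soft thresholding the abstract refers to is the standard shrinkage operator; a minimal sketch (the threshold value below is arbitrary):

```python
import numpy as np

def soft_threshold(x, tau):
    """Shrink each value toward zero by tau and zero out anything smaller than tau."""
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)

# Small coefficients (likely noise) are removed; large ones are shrunk by tau.
soft_threshold(np.array([-2.0, 0.5, 3.0]), 1.0)   # -> [-1.0, 0.0, 2.0]
```

In TSAN the threshold is presumably learned per channel rather than fixed; this only shows the operator itself.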

  • Research Article
  • Cited by 12
  • DOI: 10.1080/01431161.2023.2249598
CNN and Transformer interaction network for hyperspectral image classification
  • Sep 8, 2023
  • International Journal of Remote Sensing
  • Zhongwei Li + 4 more

The Convolutional Neural Network (CNN) has advanced hyperspectral image (HSI) classification considerably. Although many CNN-based models can extract local features in HSIs, it is difficult for them to extract global features. With its ability to capture long-range dependencies, the Transformer is gradually gaining prominence in HSI classification, but it may overlook some local details when extracting features. To address these issues, we propose a CNN and Transformer interaction network (CTIN) for HSI classification. First, a dual-branch structure is constructed in which the CNN and the Transformer are arranged in parallel to simultaneously extract local and global features from the HSI. Second, a feature interaction module is inserted between the two branches, facilitating a bi-directional flow of information between the global and local feature spaces. In this way, the network combines the advantages of the CNN and the Transformer in feature extraction as much as possible. In addition, a token generation method is designed to harness the abundant contextual information relevant to the centre pixel and improve the accuracy of the final classification. Experiments were conducted on four hyperspectral datasets (two classical datasets, Indian Pines and Salinas Valley; a new satellite dataset, Yellow River; and a self-made UAV dataset, Yellow River Willow). Experimental results show that the proposed method outperforms other state-of-the-art methods, with overall accuracies of 99.21%, 99.61%, 92.40%, and 98.17%, respectively.

  • Research Article
  • Cited by 43
  • DOI: 10.3390/rs12010125
A Multi-Scale and Multi-Level Spectral-Spatial Feature Fusion Network for Hyperspectral Image Classification
  • Jan 1, 2020
  • Remote Sensing
  • Caihong Mu + 2 more

Extracting spatial and spectral features through deep neural networks has become an effective means of classifying hyperspectral images. However, most networks rarely consider the extraction of multi-scale spatial features and cannot fully integrate spatial and spectral features. To solve these problems, this paper proposes a multi-scale and multi-level spectral-spatial feature fusion network (MSSN) for hyperspectral image classification. The network uses the original 3D cube as input and requires no feature engineering. In the MSSN, neighborhood blocks of different scales are used as network inputs, so spectral-spatial features of different scales can be effectively extracted. The proposed 3D-2D alternating residual block combines the spectral features extracted by a three-dimensional convolutional neural network (3D-CNN) with the spatial features extracted by a two-dimensional convolutional neural network (2D-CNN). It achieves not only the fusion of spectral and spatial features but also the fusion of high-level and low-level features. Experimental results on four hyperspectral datasets show that this method is superior to several state-of-the-art classification methods for hyperspectral images.

  • Research Article
  • Cited by 84
  • DOI: 10.1109/tgrs.2020.3015843
Adaptive DropBlock-Enhanced Generative Adversarial Networks for Hyperspectral Image Classification
  • Sep 1, 2020
  • IEEE Transactions on Geoscience and Remote Sensing
  • Junjie Wang + 3 more

In recent years, hyperspectral image (HSI) classification based on generative adversarial networks (GANs) has achieved great progress. GAN-based classification methods can mitigate the limited-training-sample dilemma to some extent. However, several studies have pointed out that existing GAN-based HSI classification methods are heavily affected by the imbalanced training data problem: the discriminator in a GAN always contradicts itself and tries to assign fake labels to minority-class samples, thus impairing classification performance. Another critical issue is mode collapse, where the generator is only capable of producing samples within a narrow region of the data space, which severely hinders the advancement of GAN-based HSI classification methods. In this paper, we propose an Adaptive DropBlock-enhanced Generative Adversarial Network (ADGAN) for HSI classification. First, to solve the imbalanced training data problem, we adjust the discriminator to be a single classifier, so that it no longer contradicts itself. Second, an adaptive DropBlock (AdapDrop) is proposed as a regularization method employed in the generator and discriminator to alleviate the mode collapse issue. AdapDrop generates drop masks with adaptive shapes instead of a fixed-size region, alleviating the limitations of DropBlock in dealing with ground objects of various shapes. Experimental results on three HSI datasets demonstrate that the proposed ADGAN achieves superior performance over state-of-the-art GAN-based methods. Our codes are available at https://github.com/summitgao/HC_ADGAN
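For context, standard DropBlock, which AdapDrop generalizes by adapting the mask shape, can be sketched as follows; the block size, drop probability, and feature-map size are illustrative, and this is the fixed-size baseline rather than the paper's adaptive variant:

```python
import numpy as np

rng = np.random.default_rng(2)

def dropblock(x, block_size=3, drop_prob=0.1):
    """Standard DropBlock on an (H, W) map: zero out contiguous square regions."""
    H, W = x.shape
    # Seed probability chosen so the expected dropped fraction is roughly drop_prob.
    gamma = drop_prob / (block_size ** 2)
    seeds = rng.random((H, W)) < gamma
    mask = np.ones((H, W))
    half = block_size // 2
    for r, c in zip(*np.nonzero(seeds)):
        r0, r1 = max(0, r - half), min(H, r + half + 1)
        c0, c1 = max(0, c - half), min(W, c + half + 1)
        mask[r0:r1, c0:c1] = 0.0            # drop a whole block around each seed
    kept = mask.mean()
    return x * mask / max(kept, 1e-8)       # rescale to preserve expected activation

feat = rng.standard_normal((16, 16))
out = dropblock(feat)
```

AdapDrop, per the abstract, replaces the fixed square region with masks whose shapes adapt to the ground objects.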

  • Research Article
  • DOI: 10.1080/17538947.2025.2520480
Spectral–spatial mamba adversarial defense network for hyperspectral image classification
  • Aug 1, 2025
  • International Journal of Digital Earth
  • Zhongqiang Zhang + 4 more

Deep learning models have achieved great success in hyperspectral image classification tasks. Nevertheless, they are usually vulnerable to adversarial attacks. Some existing works defend against adversarial attacks in HSI classification, but they primarily focus on large numbers of adversarial samples and on spatial relationships, while overlooking the strong long-range dependencies in HSIs. To alleviate this problem, we propose a novel spectral-spatial Mamba adversarial defense network (SSMADNet) for hyperspectral adversarial image classification. It includes a dense involution branch, a spectral Mamba branch, and a spatial multiscale Mamba branch. The dense involution branch extracts embedding features via three dense involution layers. The spectral Mamba branch learns the spectral sequence information from HSI adversarial samples. The spatial multiscale Mamba branch models the long-range interactions across the whole image. Finally, a spectral-spatial feature enhancement module is designed to adaptively enhance useful spectral-spatial features of the HSI. Extensive experimental results demonstrate that, on five HSI adversarial datasets, the proposed SSMADNet achieves higher classification accuracies than state-of-the-art adversarial defense methods. In particular, our method obtains the best OA (93.80%) on the Botswana adversarial data, which is much higher than the suboptimal method (OA = 90.30%).

  • Research Article
  • Cited by 2
  • DOI: 10.1364/josaa.478585
Hybrid spatial-spectral generative adversarial network for hyperspectral image classification.
  • Feb 21, 2023
  • Journal of the Optical Society of America A
  • Chao Ma + 5 more

In recent years, generative adversarial networks (GANs), consisting of two competing 2D convolutional neural networks (CNNs) used as a generator and a discriminator, have shown promising capabilities in hyperspectral image (HSI) classification tasks. Essentially, the performance of HSI classification lies in the ability to extract both spectral and spatial information. The 3D CNN has excellent advantages in simultaneously mining these two types of features but has rarely been used due to its high computational complexity. This paper proposes a hybrid spatial-spectral generative adversarial network (HSSGAN) for effective HSI classification. A hybrid CNN structure is developed for the construction of the generator and the discriminator. For the discriminator, the 3D CNN is utilized to extract multi-band spatial-spectral features, and the 2D CNN is then used to further represent the spatial information. To reduce the accuracy loss caused by information redundancy, a channel and spatial attention mechanism (CSAM) is specially designed. Specifically, a channel attention mechanism is exploited to enhance the discriminative spectral features, and a spatial self-attention mechanism is developed to learn long-term spatial similarity, which can effectively suppress invalid spatial features. Both quantitative and qualitative experiments on four widely used hyperspectral datasets show that the proposed HSSGAN has a satisfactory classification effect compared to conventional methods, especially with few training samples.
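The channel attention idea described here (pool each channel globally, pass through a small bottleneck, and reweight the channels) can be sketched in NumPy. The reduction ratio, weights, and feature sizes are illustrative assumptions, not HSSGAN's actual CSAM parameters:

```python
import numpy as np

rng = np.random.default_rng(3)

def channel_attention(feat, w1, w2):
    """SE-style channel attention on an (H, W, C) feature map:
    global average pool -> two small linear layers -> sigmoid gate per channel."""
    z = feat.mean(axis=(0, 1))                 # squeeze: one scalar per channel, (C,)
    h = np.maximum(z @ w1, 0.0)                # excitation bottleneck with ReLU
    gate = 1.0 / (1.0 + np.exp(-(h @ w2)))     # per-channel sigmoid weights in (0, 1)
    return feat * gate                         # reweight channels by importance

C, r = 8, 2                                    # channels and reduction ratio (illustrative)
feat = rng.standard_normal((5, 5, C))
w1 = rng.standard_normal((C, C // r)) * 0.1
w2 = rng.standard_normal((C // r, C)) * 0.1
out = channel_attention(feat, w1, w2)
```

The spatial self-attention branch mentioned in the abstract would operate analogously over positions rather than channels.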

  • Research Article
  • Cited by 16
  • DOI: 10.3390/rs13183590
A Spectral Spatial Attention Fusion with Deformable Convolutional Residual Network for Hyperspectral Image Classification
  • Sep 9, 2021
  • Remote Sensing
  • Tianyu Zhang + 3 more

Convolutional neural networks (CNNs) have exhibited excellent performance in hyperspectral image classification. However, due to the lack of labeled hyperspectral data, it is difficult to achieve high classification accuracy with few training samples. In addition, although some deep learning techniques have been used in hyperspectral image classification, given the abundant information in hyperspectral images, the problem of insufficient spatial-spectral feature extraction still exists. To address these issues, a spectral-spatial attention fusion with deformable convolution residual network (SSAF-DCR) is proposed for hyperspectral image classification. The proposed network is composed of three sequentially connected parts. In the first part, a dense spectral block is utilized to reuse spectral features as much as possible, followed by a spectral attention block that refines and optimizes the spectral features. In the second part, spatial features are extracted and selected by a dense spatial block and an attention block, respectively. The results of the first two parts are then fused and sent to the third part, where deep spatial features are extracted by the DCR block. These three parts realize the effective extraction of spectral-spatial features, and experimental results on four commonly used hyperspectral datasets demonstrate that the proposed SSAF-DCR method is superior to some state-of-the-art methods with very few training samples.

  • Research Article
  • Cited by 8
  • DOI: 10.1109/lgrs.2022.3171536
DRGCN: Dual Residual Graph Convolutional Network for Hyperspectral Image Classification
  • Jan 1, 2022
  • IEEE Geoscience and Remote Sensing Letters
  • Rong Chen + 2 more

Recently, the graph convolutional network (GCN) has drawn increasing attention in hyperspectral image (HSI) classification, as it can process arbitrary non-Euclidean data. However, a dynamic GCN that refines the graph relies heavily on the graph embedding of the previous layer, which results in performance degradation when the embedding contains noise. In this letter, we propose a novel dual residual graph convolutional network (DRGCN) for HSI classification that integrates the adjacency matrices of two GCNs. In detail, one GCN applies a soft adjacency matrix to extract spatial features, while the other utilizes a dynamic adjacency matrix to extract global context-aware features. The features extracted by the two GCNs are then fused to make full use of the complementary and correlated information between the two graph representations. Moreover, we introduce residual learning to optimize the graph convolutional layers during training and alleviate the over-smoothing problem. The advantage of the dual GCN is that it can extract robust and discriminative features from HSIs. Extensive experiments on four HSI datasets, namely Indian Pines, Pavia University, Salinas, and Houston University, demonstrate the effectiveness and superiority of the proposed DRGCN, even with small-sized training data.
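A single graph-convolution layer of the kind both branches build on, plus a simple additive fusion of two adjacency choices, can be sketched as follows; the graphs, features, and the sum-based fusion are toy assumptions, not DRGCN's exact design:

```python
import numpy as np

rng = np.random.default_rng(5)

def gcn_layer(A, X, W):
    """One graph-convolution layer: add self-loops, symmetrically normalize the
    adjacency, then apply ReLU(A_norm X W)."""
    A_hat = A + np.eye(len(A))                 # self-loops keep each node's own features
    d = A_hat.sum(axis=1)                      # degrees (>= 1 thanks to self-loops)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
    A_norm = D_inv_sqrt @ A_hat @ D_inv_sqrt
    return np.maximum(A_norm @ X @ W, 0.0)

# Toy graph of 6 nodes (e.g. superpixels) with 4-dim features (sizes illustrative).
A = (rng.random((6, 6)) > 0.5).astype(float)
A = np.maximum(A, A.T)                         # make the graph undirected
np.fill_diagonal(A, 0.0)
X = rng.standard_normal((6, 4))
W = rng.standard_normal((4, 3)) * 0.5

# Dual-GCN idea: run the layer under two different adjacencies and fuse the results.
fused = gcn_layer(A, X, W) + gcn_layer(np.ones((6, 6)) - np.eye(6), X, W)
```

DRGCN uses a soft and a dynamic adjacency matrix and a learned fusion; the fixed matrices and plain sum here only illustrate the structure.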

  • Research Article
  • Cited by 7
  • DOI: 10.1016/j.neunet.2025.107311
Dual selective fusion transformer network for hyperspectral image classification.
  • Jul 1, 2025
  • Neural networks : the official journal of the International Neural Network Society
  • Yichu Xu + 3 more


  • Research Article
  • Cited by 8
  • DOI: 10.3390/rs13071290
Wide Sliding Window and Subsampling Network for Hyperspectral Image Classification
  • Mar 28, 2021
  • Remote Sensing
  • Jiangbo Xi + 6 more

Recently, deep learning methods such as convolutional neural networks (CNNs) have achieved high performance in hyperspectral image (HSI) classification. The limited training samples of HSIs make it hard to use deep learning methods with many layers and a large number of convolutional kernels, as in large-scale imagery tasks, and CNN-based methods usually need long training times. In this paper, we present a wide sliding window and subsampling network (WSWS Net) for HSI classification. It is based on layers of transform kernels with sliding windows and subsampling (WSWS) and can be extended in the wide direction to learn both spatial and spectral features more efficiently. The learned features are subsampled to reduce the computational load and memory usage. Thus, WSWS layers can learn higher-level spatial and spectral features efficiently, and the proposed network can be trained easily, since only the linear weights need to be computed with least squares. The experimental results show that the WSWS Net achieves excellent performance on different hyperspectral remote sensing datasets compared with other shallow and deep learning methods. The effects of the ratio of training samples and the size of image patches, as well as visualizations of the features in the WSWS layers, are also presented.
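The closed-form training step the abstract mentions, fitting only the linear output weights by least squares over fixed features, can be sketched as follows; the feature matrix here is random data standing in for the WSWS layer outputs, and all sizes are illustrative:

```python
import numpy as np

rng = np.random.default_rng(4)

# Toy setup: 100 samples with 32-dim features (standing in for fixed WSWS-layer
# outputs) and 5 classes. Only the linear readout is fit, in closed form.
X = rng.standard_normal((100, 32))
y = rng.integers(0, 5, size=100)
Y = np.eye(5)[y]                            # one-hot targets, shape (100, 5)

W, *_ = np.linalg.lstsq(X, Y, rcond=None)   # least-squares linear weights, (32, 5)
pred = np.argmax(X @ W, axis=1)             # class = largest linear response
```

Because no gradient descent is involved, this readout trains in a single solve, which is the source of the fast training the abstract claims.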

  • Research Article
  • Cited by 57
  • DOI: 10.1109/tgrs.2021.3049377
NAS-Guided Lightweight Multiscale Attention Fusion Network for Hyperspectral Image Classification
  • Jan 21, 2021
  • IEEE Transactions on Geoscience and Remote Sensing
  • Jianing Wang + 6 more

Deep learning (DL) has become a hot topic in the research field of hyperspectral image (HSI) classification. However, the increasing depth and size of deep learning models pose great challenges for mobile and embedded vision applications. In this article, we propose a network architecture search (NAS)-guided lightweight spectral-spatial attention feature fusion network (LMAFN) for HSI classification. The overall architecture of the proposed network is guided by several conclusions from NAS and achieves fewer parameters and lower computational cost with a deeper network structure, by exploiting multiscale Ghost modules grouped with an efficient channel attention (ECA) module that adaptively adjusts the weights of different channels. This helps to fully extract spectral-spatial discriminant features and avoid the information loss of dimension-reduction operations. Specifically, a multilayer feature fusion method is proposed to extract the fused spectral-spatial features of each layer by considering the complementary information of different hierarchical structures. Therefore, high-level spectral-spatial attributes are gradually exploited as the layers increase and are fused across layers. Experimental verification on three real HSI datasets demonstrates that the proposed framework delivers more satisfactory classification performance and efficiency with a deeper network structure and a smaller parameter size.
