Hyperspectral Image Classification With Pre-Activation Residual Attention Network
Recently, convolutional neural networks (CNNs) have been introduced for hyperspectral image (HSI) classification and have shown considerable classification performance. However, previous CNNs designed for spectral-spatial HSI classification emphasize learning the spatial correlation of HSI data and neglect the channel responses of feature maps. Furthermore, the lack of training samples remains the major obstacle to better performance for CNN-based HSI classification methods. To address these issues, this paper proposes a new end-to-end pre-activation residual attention network (PRAN) for HSI classification. The pre-activation mechanism and the attention mechanism are introduced into the proposed network, and a pre-activation residual attention block (PRAB) is designed, which allows the network to adaptively recalibrate channel-wise feature responses and learn more robust joint spectral-spatial feature representations. The proposed PRAN is equipped with two PRABs and several convolutional layers with different kernel sizes, which enables it to extract high-level discriminative features. Experimental results on three benchmark HSI datasets show that the proposed method achieves competitive performance compared with several state-of-the-art HSI classification methods, especially when the training set is relatively small.
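The abstract does not give the internal layout of a PRAB, so the following is only a rough sketch of the general idea it names: activation applied before the convolution, channel-wise recalibration, and an identity skip connection. It assumes a squeeze-and-excitation-style gate, a 1x1 convolution, and a simple per-channel normalization standing in for batch normalization. All function names, shapes, and the reduction ratio `r` are hypothetical and not taken from the paper.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def channel_attention(feat, w1, w2):
    """SE-style recalibration (assumed form). feat: (C, H, W); w1: (C//r, C); w2: (C, C//r)."""
    squeezed = feat.mean(axis=(1, 2))            # global average pool -> (C,)
    gates = sigmoid(w2 @ relu(w1 @ squeezed))    # one weight in (0, 1) per channel
    return feat * gates[:, None, None]           # rescale each channel

def pre_activation_residual_attention(feat, w_conv, w1, w2):
    """Hypothetical block: normalize -> ReLU -> 1x1 conv -> channel attention, plus identity skip."""
    mean = feat.mean(axis=(1, 2), keepdims=True)
    std = feat.std(axis=(1, 2), keepdims=True)
    normed = (feat - mean) / (std + 1e-5)        # stand-in for batch normalization
    activated = relu(normed)                     # activation comes BEFORE the convolution
    conv = np.einsum("oc,chw->ohw", w_conv, activated)  # 1x1 convolution across channels
    return feat + channel_attention(conv, w1, w2)       # residual (identity) connection

rng = np.random.default_rng(0)
C, H, W, r = 8, 5, 5, 2                          # illustrative sizes only
x = rng.normal(size=(C, H, W))
out = pre_activation_residual_attention(
    x,
    rng.normal(size=(C, C)) * 0.1,
    rng.normal(size=(C // r, C)) * 0.1,
    rng.normal(size=(C, C // r)) * 0.1,
)
print(out.shape)
```

Because of the identity skip, the block reduces to the identity mapping when the convolution weights are zero, which is the property that makes residual blocks easy to stack.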
- Conference Article
1
- 10.1145/3641584.3641609
- Sep 22, 2023
With the continuing progress of deep learning, applying it to hyperspectral image classification to improve accuracy has become a major research direction. Convolutional neural networks (CNNs) are among the most commonly used deep learning methods for visual data, and they are widely used in hyperspectral image (HSI) classification by virtue of their excellent contextual modeling capability. The performance of HSI classification depends heavily on spatial and spectral information, yet current CNN-based HSI classification models extract spatial-spectral features insufficiently and fail to mine and represent the sequence properties of spectral features well. To address these problems, this paper proposes an HSI classification method that uses a 3D attention mechanism in collaboration with a Transformer. We introduce a variant Transformer model built on a hybrid of an improved 3D-CNN and a 2D-CNN, which combines complementary spatial-spectral and spectral information through 3D and 2D convolutions, adds a variant attention module to strengthen spatial texture features, and uses a grouped transfer Transformer with skip connections so that lower layers can better learn upper-layer features. First, a variant channel attention mechanism is introduced into the 3D-CNN to enhance its acquisition of spectral information. Second, a variant spatial attention mechanism is introduced so that the 3D-CNN better captures the spatial information of hyperspectral images; the acquired spatial and spectral features are then passed to the 2D-CNN, which captures local feature information. Finally, the resulting features are passed to the variant Transformer model, which compensates for the fact that a CNN captures hyperspectral image features only in local contexts and so recovers global information over the feature sequences. The proposed model was evaluated on two hyperspectral datasets, Indian Pines and Pavia University (PU). On the PU dataset, the overall classification accuracy (OA), average classification accuracy (AA), and Kappa coefficient reach up to 99.59%, 99.31%, and 99.45%, respectively, an improvement in classification accuracy over current cutting-edge techniques.
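For readers unfamiliar with the three reported metrics, here is a minimal sketch of how OA, AA, and the Kappa coefficient are conventionally computed from a confusion matrix. The function name and the 2x2 matrix are made up for illustration and have nothing to do with the paper's results.

```python
import numpy as np

def classification_scores(confusion):
    """confusion[i, j] = number of samples of true class i predicted as class j."""
    confusion = np.asarray(confusion, dtype=float)
    total = confusion.sum()
    oa = np.trace(confusion) / total                         # overall accuracy
    per_class = np.diag(confusion) / confusion.sum(axis=1)   # recall of each class
    aa = per_class.mean()                                    # average accuracy
    # Agreement expected by chance, from the row and column marginals.
    expected = (confusion.sum(axis=0) * confusion.sum(axis=1)).sum() / total**2
    kappa = (oa - expected) / (1.0 - expected)               # Cohen's kappa
    return oa, aa, kappa

# Toy two-class confusion matrix (illustrative values only).
cm = [[45, 5],
      [10, 40]]
oa, aa, kappa = classification_scores(cm)
print(f"OA={oa:.2f} AA={aa:.2f} Kappa={kappa:.2f}")  # OA=0.85 AA=0.85 Kappa=0.70
```

OA weights every sample equally, AA weights every class equally, and Kappa discounts the agreement expected by chance, which is why papers on class-imbalanced HSI scenes usually report all three.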
- Research Article
2
- 10.1371/journal.pone.0322345
- May 23, 2025
- PloS one
Hyperspectral image (HSI) classification is commonly addressed with convolutional neural networks (CNNs). However, most models that use traditional convolutions for HSI classification extract redundant information in the convolution layers, so the subsequent network structure carries a large number of parameters and complex computations. This limits classification effectiveness, particularly when computational power and storage capacity are constrained. To address these issues, this paper proposes a lightweight multi-layer feature fusion classification method for hyperspectral images based on spatial and channel reconstruction (SCNet). First, the method reduces redundant computation of spatial and spectral features by introducing Spatial and Channel Reconstruction Convolutions (SCConv), a novel convolutional compression method. Second, the network backbone is built by stacking multiple SCConv modules, which allows the network to capture spatial and spectral features that are more beneficial for hyperspectral image classification. Finally, to make effective use of the multi-layer feature information generated by the SCConv modules, a multi-layer feature fusion (MLFF) unit is designed to connect multiple feature maps at different depths, thereby obtaining a more robust feature representation. Experiments on four benchmark datasets demonstrate that, compared with seven other hyperspectral image classification methods, the network has significant advantages in the number of parameters, model complexity, and testing time.
- Research Article
84
- 10.1109/tgrs.2022.3185640
- Jan 1, 2022
- IEEE Transactions on Geoscience and Remote Sensing
Convolutional neural networks (CNNs) have been extensively applied to hyperspectral (HS) image classification and have achieved promising performance. However, CNN-based HS image classification methods struggle to capture dependencies among HS image pixels at distant positions and bands. Moreover, the limited receptive field of convolutional layers severely hinders the development of CNN structures. To tackle these problems, this paper proposes the Bottleneck Spatial-Spectral Transformer (BS2T) to capture the long-range global dependencies of HS image pixels; it can be regarded as a feature extraction module for HS image classification networks. More specifically, inspired by the Bottleneck Transformer in computer vision, the proposed BS2T comprises a feature contraction module, a multi-head spatial-spectral self-attention (MHS2A) module, and a feature expansion module. In this way, convolutional operations are replaced by the MHS2A to capture the long-range dependency of HS pixels regardless of their spatial position and distance. Meanwhile, to highlight the spectral features of HS images, the MHS2A module introduces spectral information and content spatial positional information into classical multi-head self-attention, making the attention more position-aware and spectrum-aware. On this basis, a dual-branch HS image classification framework based on a 3D CNN and BS2T is defined to jointly extract the local and global features of HS images. Experimental results on three public HS image classification datasets show that the proposed framework achieves a significant improvement over state-of-the-art methods. The source code of the proposed framework can be downloaded from https://github.com/srxlnnu/BS2T.
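To make the long-range dependency argument concrete, the sketch below implements plain single-head scaled dot-product self-attention over the pixels of a patch. It is not the MHS2A module: the spectral and positional terms described in the abstract are omitted, and the patch size, feature width, and weight matrices are hypothetical.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)     # subtract the max for numerical stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(tokens, w_q, w_k, w_v):
    """tokens: (N, D), one row per pixel. Returns the (N, D_v) output and the (N, N) weights."""
    q, k, v = tokens @ w_q, tokens @ w_k, tokens @ w_v
    scores = q @ k.T / np.sqrt(k.shape[1])      # similarity of every pixel pair
    weights = softmax(scores, axis=-1)          # each row sums to 1
    return weights @ v, weights

rng = np.random.default_rng(0)
patch = rng.normal(size=(7, 7, 16))             # hypothetical 7x7 patch, 16 feature channels
tokens = patch.reshape(-1, 16)                  # flatten to 49 pixel tokens
w_q, w_k, w_v = (rng.normal(size=(16, 8)) for _ in range(3))
out, weights = self_attention(tokens, w_q, w_k, w_v)
print(out.shape, weights.shape)                 # (49, 8) (49, 49)
```

The 49 x 49 weight matrix relates every pixel to every other pixel in a single step, whereas a 3x3 convolution only mixes immediate neighbors per layer.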
- Research Article
10
- 10.1109/jstars.2021.3123371
- Jan 1, 2021
- IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing
Over the past few years, convolutional neural networks (CNNs) have been broadly adopted in remote sensing (RS) image processing because of their impressive feature extraction capabilities. Nevertheless, it remains a challenge for CNN-based hyperspectral image (HSI) classification methods to extract more effective spectral-spatial features while considering all spectral bands. Driven by this issue, we propose a novel approach to the HSI classification task, referred to as the multi-level joint feature extraction network (MJFEN). The proposed network makes full use of the information in each channel of the HSI and transforms it into valid channel-wise spatial features through a designed convolution process. Moreover, these feature maps form global attention details that guide the extraction of spectral-spatial features, which are passed to the next level for further feature mining. The features obtained at different levels are then integrated for ground object classification. Compared with several state-of-the-art HSI classification methods on four public datasets, experimental results demonstrate the effectiveness and remarkable feature extraction capability of the proposed approach.
- Research Article
11
- 10.1109/access.2020.2974025
- Jan 1, 2020
- IEEE Access
Convolutional neural networks (CNNs) are widely used in hyperspectral image (HSI) classification. However, most CNN-based HSI classification methods discard fine-grained spatial (FGS) details during the sequence of convolution and pooling operations. To address this issue, a unified encoder-decoder framework, denoted FGSCNN, is proposed to integrate high-level semantics and FGS details for HSI classification. The encoder, consisting of a series of convolution and pooling layers, captures high-level semantic information in low-resolution feature maps. The decoder fuses the high-level, low-resolution semantic information with the fine-grained, high-resolution spatial information to obtain FGS features with high-level semantics. Deconvolution layers and skip connections are used in the decoder to retain the FGS details, while convolution layers combine the FGS features with high-level semantics. Based on the encoder-decoder framework, a unified loss function is used to integrate the high-level semantic information and FGS details in an end-to-end manner for HSI classification. Experiments on three public datasets, i.e., Indian Pines, Pavia University, and Salinas, demonstrate the effectiveness of the proposed method for HSI classification.
- Research Article
13
- 10.1016/j.sigpro.2023.109153
- Jun 13, 2023
- Signal Processing
MS3Net: Multiscale stratified-split symmetric network with quadra-view attention for hyperspectral image classification
- Research Article
50
- 10.1007/s41064-020-00124-x
- Sep 3, 2020
- PFG – Journal of Photogrammetry, Remote Sensing and Geoinformation Science
Over the past few decades, hyperspectral image (HSI) classification has garnered increasing attention from the remote sensing research community. The largest challenge in HSI classification is the high feature dimensionality arising from the many HSI bands, given the limited number of labeled samples. Deep learning, and convolutional neural networks (CNNs) in particular, has been shown to be highly effective in several computer vision problems such as object detection and image classification. In terms of accuracy and computational cost, one of the best CNN architectures is the Inception model, i.e., the winner of the ImageNet Large Scale Visual Recognition Competition (ILSVRC) 2014 challenge. Another architecture that has significantly improved image recognition performance is the Residual Network (ResNet), i.e., the winner of the ILSVRC 2015 challenge. Inspired by the strong performance of the Inception and ResNet architectures, we investigate the possibility of combining the core ideas of these two models into a hybrid architecture to improve HSI classification performance. We tested this combined model on four standard HSI datasets, and it shows competitive results compared with other existing HSI classification methods. Our hybrid deep ResNet-Inception architecture obtained accuracies of 95.31% on the Pavia University dataset, 99.02% on the Pavia Centre scenes dataset, 95.33% on the Salinas dataset, and 90.57% on the Indian Pines dataset.
- Research Article
32
- 10.32604/cmes.2022.020601
- Jan 1, 2022
- Computer Modeling in Engineering & Sciences
Hyperspectral image (HSI) classification has been one of the most important tasks in the remote sensing community over the last few decades. Due to the presence of highly correlated bands and limited training samples in HSI, discriminative feature extraction has been challenging for traditional machine learning methods. Recently, deep learning-based methods have been recognized as a powerful feature extraction tool and have drawn significant attention in HSI classification. Among various deep learning models, convolutional neural networks (CNNs) have shown huge success and offer great potential to yield high performance in HSI classification. Motivated by this success, this paper presents a systematic review of different CNN architectures for HSI classification and provides some future guidelines. To accomplish this, our study takes a few important steps. First, we focus on different CNN architectures that are able to extract spectral, spatial, and joint spectral-spatial features. Then, many publications related to CNN-based HSI classification are reviewed systematically. Further, a detailed comparative performance analysis is presented for four CNN models, namely 1D CNN, 2D CNN, 3D CNN, and feature fusion based CNN (FFCNN). Four benchmark HSI datasets are used in our experiments to evaluate performance. Finally, we conclude the paper with challenges in CNN-based HSI classification and future guidelines that may help researchers working on HSI classification using CNNs.
- Research Article
18
- 10.3390/rs14092265
- May 8, 2022
- Remote Sensing
In recent years, hyperspectral image (HSI) classification has become a hot research direction in remote sensing image processing. Benefiting from the development of deep learning, convolutional neural networks (CNNs) have shown extraordinary achievements in HSI classification, and numerous methods combining CNNs with attention mechanisms (AMs) have been proposed. However, to fully mine the features of HSI, some previous methods apply dense connections to enhance feature transfer between convolution layers. Although dense connections allow these methods to fully extract features from a few training samples, they decrease model efficiency and increase computational cost. Furthermore, to balance model performance against complexity, the AMs in these methods compress a large number of channels or spatial resolutions during training, which discards a large amount of useful information. To tackle these issues, this article proposes a novel one-shot dense network with polarized attention, namely OSDN, for HSI classification. More precisely, since HSI contains rich spectral and spatial information, the OSDN has two independent branches that extract spectral and spatial features, respectively. Similarly, the polarized AMs contain two components: channel-only AMs and spatial-only AMs. Both polarized AMs use a specially designed filtering method to reduce model complexity while maintaining high internal resolution in both the channel and spatial dimensions. To verify the effectiveness and lightness of OSDN, extensive experiments were carried out on five benchmark HSI datasets, namely Pavia University (PU), Kennedy Space Center (KSC), Botswana (BS), Houston 2013 (HS), and Salinas Valley (SV). Experimental results consistently showed that the OSDN greatly reduces computational cost and parameter count while maintaining high accuracy with few training samples.
- Research Article
40
- 10.3390/s21051751
- Mar 3, 2021
- Sensors
Hyperspectral image (HSI) classification is the subject of intense research in remote sensing. The tremendous success of deep learning in computer vision has recently sparked interest in applying it to hyperspectral image classification. However, most deep learning methods for hyperspectral image classification are based on convolutional neural networks (CNNs), which require heavy GPU memory resources and long run times. Recently, another deep learning model, the transformer, has been applied to image recognition, and the results demonstrate the great potential of the transformer network for computer vision tasks. In this paper, we propose a model for hyperspectral image classification based on the transformer, which is widely used in natural language processing. In addition, we believe we are the first to combine metric learning and the transformer model in hyperspectral image classification. Moreover, to improve classification performance when the available training samples are limited, we use 1-D convolution and the Mish activation function. Experimental results on three widely used hyperspectral image data sets demonstrate the proposed model's advantages in accuracy, GPU memory cost, and running time.
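The Mish activation mentioned above has a simple closed form, mish(x) = x * tanh(softplus(x)). A minimal NumPy sketch of that formula follows; how the paper wires it into its network is not stated in the abstract and is not shown here.

```python
import numpy as np

def mish(x):
    """Mish activation: x * tanh(softplus(x)), with softplus(x) = log(1 + exp(x))."""
    x = np.asarray(x, dtype=float)
    softplus = np.logaddexp(0.0, x)   # numerically stable log(1 + exp(x))
    return x * np.tanh(softplus)

# Smooth everywhere, close to the identity for large positive inputs,
# and close to zero (slightly negative) for large negative inputs.
print(mish(np.array([-20.0, -1.0, 0.0, 1.0, 20.0])))
```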
- Research Article
20
- 10.3390/rs14102355
- May 13, 2022
- Remote Sensing
In recent years, hyperspectral image (HSI) classification (HSIC) methods that use deep learning have proved to be effective, and convolutional neural networks (CNNs) in particular have proved highly effective. However, some key issues need to be addressed when classifying hyperspectral images (HSIs), such as small sample sizes, which can impair the generalization ability of CNNs and the HSIC results. To address this problem, we present a new network that integrates hybrid pyramid feature fusion and coordinate attention to improve small-sample HSI classification. The contribution of this paper lies in three main areas. First, a baseline network is designed: a simple hybrid 3D-2D CNN, with which more robust spectral-spatial feature information can be obtained from the HSI. Second, a hybrid pyramid feature fusion mechanism is used, so that feature maps of different levels and scales can be effectively fused to enhance the features extracted by the model. Finally, coordinate attention mechanisms are used in the network, which not only adaptively capture information along the spectral dimension but also include direction-aware and position-sensitive information. As a result, the proposed CNN structure can extract more useful HSI features and generalize effectively to test samples. Experiments on three public HSI datasets show that the proposed method obtains better results than several existing methods.
- Research Article
183
- 10.1109/tgrs.2019.2951445
- Dec 5, 2019
- IEEE Transactions on Geoscience and Remote Sensing
Deep convolutional neural networks (CNNs) have shown outstanding performance in hyperspectral image (HSI) classification. The success of CNN-based HSI classification relies on the availability of sufficient training samples. However, collecting training samples is expensive and time consuming. Meanwhile, many models pretrained on large-scale data sets extract general and discriminative features, and the proper reuse of their low-level and midlevel representations will significantly improve HSI classification accuracy. The large-scale ImageNet data set has three channels, but an HSI contains hundreds of channels, so pretrained models cannot simply be adapted to the classification of HSIs. In this article, heterogeneous transfer learning for HSI classification is proposed. First, a mapping layer is used to handle the difference in the number of channels. Then, the model architectures and weights of the CNN trained on the ImageNet data set are used to initialize the model and weights of the HSI classification network. Finally, a well-designed neural network is used to perform the HSI classification task. Furthermore, an attention mechanism is used to adjust the feature maps to account for the difference between the heterogeneous data sets. Moreover, controlled random sampling is used as another training sample selection method to test the effectiveness of the proposed methods. Experimental results on four popular hyperspectral data sets with two training sample selection strategies show that the transferred CNN obtains better classification accuracy than state-of-the-art methods. In addition, the idea of heterogeneous transfer learning may open a new window for further research.
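The channel mismatch described above can be illustrated with a pointwise linear projection from B spectral bands to three channels, which is one plausible reading of a "mapping layer". The abstract does not specify the actual layer, so the function name, the 200-band patch, and the random weights below are assumptions for illustration only.

```python
import numpy as np

def mapping_layer(cube, weight, bias):
    """Project each pixel's spectrum to 3 channels (equivalent to a 1x1 convolution).

    cube: (H, W, B); weight: (B, 3); bias: (3,). Returns (H, W, 3).
    """
    return cube @ weight + bias

rng = np.random.default_rng(0)
cube = rng.normal(size=(9, 9, 200))         # hypothetical HSI patch with 200 bands
weight = rng.normal(size=(200, 3)) * 0.01   # would be learned in practice; random here
bias = np.zeros(3)
rgb_like = mapping_layer(cube, weight, bias)
print(rgb_like.shape)                       # (9, 9, 3)
```

The three-channel output has the input shape an ImageNet-pretrained backbone expects, so the pretrained weights can be reused while only the projection is learned from scratch.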
- Conference Article
8
- 10.1109/iccece54139.2022.9712772
- Jan 14, 2022
Hyperspectral image (HSI) classification is one of the most popular applications in remote sensing. In practice, because manual labeling is costly, only a few labeled hyperspectral image samples can be obtained. With so few labeled training samples, deep network methods tend to overfit, resulting in a sharp decline in classification accuracy. To solve this problem, this paper proposes a classification method for hyperspectral images based on knowledge distillation and heterogeneous few-shot learning. First, the model pretrains the feature extraction network on miniImageNet, a small-sample natural image dataset with abundant labeled images, and introduces knowledge distillation to improve the feature expression capability of the shallow network in the small-sample setting. Then, knowledge is transferred between the two heterogeneous data sets: the weights obtained by the model on the natural image data set are transferred to the backbone network for hyperspectral image classification to improve the accuracy of HSI classification. Finally, the classifier is fine-tuned on HSI using the small-sample learning paradigm to extract discriminative hyperspectral image features and further enhance the model's expression of detail. Experimental results on two hyperspectral image classification datasets show that the proposed method effectively improves the accuracy of small-sample hyperspectral image classification.
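The abstract does not say which distillation objective is used. As a point of reference, the sketch below shows the widely used soft-target formulation, in which a student is trained to match a teacher's temperature-softened class probabilities. The function names, the temperature value, and the example logits are all hypothetical.

```python
import numpy as np

def softmax(logits, temperature=1.0):
    z = np.asarray(logits, dtype=float) / temperature
    z = z - z.max(axis=-1, keepdims=True)        # subtract the max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def distillation_loss(student_logits, teacher_logits, temperature=4.0):
    """Mean KL(teacher || student) on softened probabilities, scaled by T^2 (assumed form)."""
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    kl = np.sum(p_teacher * (np.log(p_teacher) - np.log(p_student)), axis=-1)
    return float(np.mean(kl) * temperature**2)

teacher = np.array([[5.0, 1.0, 0.5], [0.2, 4.0, 0.1]])   # made-up logits
student = np.array([[2.0, 1.5, 0.5], [0.5, 2.0, 1.0]])
print(distillation_loss(student, teacher))                # positive: the student differs
print(distillation_loss(teacher, teacher))                # 0.0: identical distributions
```

A higher temperature flattens the teacher's distribution so that the relative scores of the non-target classes also carry a training signal for the smaller network.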
- Research Article
21
- 10.1109/lgrs.2021.3060876
- May 12, 2021
- IEEE Geoscience and Remote Sensing Letters
Deep learning is a powerful technique for image processing, and the convolutional neural network (CNN) is one of the most widely used approaches for hyperspectral image (HSI) classification. These methods mostly need a time-consuming pretraining process to obtain deep features. Random patches networks (RPNets) provide a novel approach in which the convolution kernels are taken from the original image without any pretraining process. In this letter, we propose a novel HSI classification method, the multiscale random convolution broad learning system (MRC-BLS), which uses the spatial features learned by an adaptive weighted mean filter as the convolution kernel to extract local spatial features in the first layer. Random convolution kernels of different sizes yield a multiscale feature map, and the weighted fusion of the multiscale spatial features extracted by kernels of different sizes gives better performance in HSI classification. A broad learning system (BLS) serves as an efficient classifier that classifies images using the multiscale random features. Experiments on three HSI data sets confirm the efficiency and satisfactory performance of the proposed method.
- Research Article
36
- 10.1109/tgrs.2022.3180685
- Jan 1, 2022
- IEEE Transactions on Geoscience and Remote Sensing
Hyperspectral image (HSI) classification has been a hot topic for decades, as hyperspectral images have rich spatial and spectral information and provide a strong basis for distinguishing different land-cover objects. Benefiting from the development of deep learning technologies, deep learning-based HSI classification methods have achieved promising performance. Recently, several neural architecture search (NAS) algorithms have been proposed for HSI classification, which further raise the accuracy of HSI classification to a new level. In this paper, NAS and Transformer are combined to handle the HSI classification task for the first time. Compared with previous work, the proposed method has two main differences. First, we revisit the search spaces designed in previous HSI classification NAS methods and propose a novel hybrid search space, consisting of the space dominated cell and the spectrum dominated cell. Compared with search spaces proposed in previous works, the proposed hybrid search space is better aligned with the characteristics of HSI data, that is, HSIs have a relatively low spatial resolution and an extremely high spectral resolution. Second, to further improve the classification accuracy, we graft the emerging transformer module onto the automatically designed convolutional neural network (CNN) to add global information to the local-region-focused features learned by the CNN. Experimental results on three public HSI datasets show that the proposed method achieves much better performance than the comparison approaches, including manually designed networks and NAS-based HSI classification methods. In particular, on the most recently captured dataset, Houston University, overall accuracy is improved by nearly 6 percentage points. Code is available at: https://github.com/Cecilia-xue/HyT-NAS.