A Hybrid Deep ResNet and Inception Model for Hyperspectral Image Classification
Over the past few decades, hyperspectral image (HSI) classification has garnered increasing attention from the remote sensing research community. The greatest challenge in HSI classification is the high feature dimensionality arising from the many HSI bands, combined with the limited number of labeled samples. Deep learning, and convolutional neural networks (CNNs) in particular, have been shown to be highly effective in several computer vision problems such as object detection and image classification. In terms of accuracy and computational cost, one of the best CNN architectures is the Inception model, the winner of the ImageNet Large Scale Visual Recognition Challenge (ILSVRC) 2014. Another architecture that has significantly improved image recognition performance is the Residual Network (ResNet), the winner of ILSVRC 2015. Inspired by the strong performance of the Inception and ResNet architectures, we investigate the possibility of combining the core ideas of these two models into a hybrid architecture to improve HSI classification performance. We tested this combined model on four standard HSI datasets, and it shows competitive results compared with existing HSI classification methods. Our hybrid deep ResNet-Inception architecture obtained accuracies of 95.31% on the Pavia University dataset, 99.02% on the Pavia Centre scenes dataset, 95.33% on the Salinas dataset, and 90.57% on the Indian Pines dataset.
- Research Article
32
- 10.32604/cmes.2022.020601
- Jan 1, 2022
- Computer Modeling in Engineering & Sciences
Hyperspectral image (HSI) classification has been one of the most important tasks in the remote sensing community over the last few decades. Because HSI contains highly correlated bands and limited training samples, discriminative feature extraction was challenging for traditional machine learning methods. Recently, deep learning based methods have been recognized as a powerful feature extraction tool and have drawn a significant amount of attention in HSI classification. Among various deep learning models, convolutional neural networks (CNNs) have shown huge success and offer great potential to yield high performance in HSI classification. Motivated by this success, this paper presents a systematic review of different CNN architectures for HSI classification and provides some future guidelines. To accomplish this, our study took a few important steps. First, we focused on different CNN architectures that are able to extract spectral, spatial, and joint spectral-spatial features. Then, many publications related to CNN based HSI classification were reviewed systematically. Further, a detailed comparative performance analysis is presented for four CNN models, namely 1D CNN, 2D CNN, 3D CNN, and feature fusion based CNN (FFCNN). Four benchmark HSI datasets were used in our experiments to evaluate performance. Finally, we conclude the paper with challenges in CNN based HSI classification and future guidelines that may help researchers working on HSI classification using CNNs.
- Research Article
84
- 10.1109/tgrs.2022.3185640
- Jan 1, 2022
- IEEE Transactions on Geoscience and Remote Sensing
Convolutional Neural Networks (CNNs) have been extensively applied to hyperspectral (HS) image classification tasks and have achieved promising performance. However, CNN based HS image classification methods struggle to capture dependencies among HS image pixels at distant positions and bands. Moreover, the limited receptive field of convolutional layers severely hinders the development of the CNN structure. To tackle these problems, this paper proposes the Bottleneck Spatial-Spectral Transformer (BS2T) to capture the long-range global dependencies of HS image pixels; it can be regarded as a feature extraction module for HS image classification networks. More specifically, inspired by the Bottleneck Transformer in computer vision, the proposed BS2T consists of a feature contraction module, a multi-head spatial-spectral self-attention (MHS2A) module, and a feature expansion module. In this way, convolutional operations are replaced by the MHS2A to capture the long-range dependency of HS pixels regardless of their spatial position and distance. Meanwhile, in the MHS2A module, to highlight the spectral features of HS images, we introduce spectral information and content spatial positional information into classical multi-head self-attention to make the attention more position aware and spectrum aware. On this basis, a dual-branch HS image classification framework based on 3D CNN and BS2T is defined for jointly extracting the local-global features of HS images. Experimental results on three public HS image classification datasets show that the proposed classification framework achieves a significant improvement compared with state-of-the-art methods. The source code of the proposed framework can be downloaded from https://github.com/srxlnnu/BS2T.
- Conference Article
5
- 10.3390/iecag2021-09739
- May 1, 2021
Hyperspectral images (HSI) offer detailed spectral reflectance information about sensed objects by providing information on hundreds of narrow spectral bands. HSI have a leading role in a broad range of applications, such as forestry, agriculture, geology, and environmental sciences. The monitoring and management of agricultural lands is of great importance for meeting the nutritional and other needs of a rapidly and continuously increasing world population. In this context, classification of HSI is an effective way of creating land use and land cover maps quickly and accurately. In recent years, classification of HSI using convolutional neural networks (CNNs), a class of deep learning models, has become a very popular research topic, and researchers have developed several CNN architectures. The aim of this study was to investigate the classification performance of a CNN model on agricultural HSI scenes. For this purpose, a 3D-2D CNN framework and a well-known support vector machine (SVM) model were compared using the Indian Pines and Salinas Scene datasets, which contain crop and mixed vegetation classes. The study confirmed that the 3D-2D CNN offers superior performance for classifying agricultural HSI datasets.
- Conference Article
1
- 10.1145/3641584.3641609
- Sep 22, 2023
With continuous innovation in deep learning, introducing deep learning into hyperspectral image classification to enhance classification accuracy has become a major research direction. Convolutional Neural Networks (CNNs) are among the most commonly used deep learning-based visual data processing methods, and are widely used in hyperspectral image (HSI) classification by virtue of their excellent contextual modeling capability. The performance of HSI classification depends heavily on spatial and spectral information, yet current CNN-based HSI classification models extract spatial-spectral features insufficiently and fail to capture and represent the sequence properties of spectral features well. To address these problems, this paper proposes an HSI classification method that uses a 3D attention mechanism in collaboration with a Transformer. We introduce a variant Transformer model built on a hybrid of an improved 3D-CNN and a 2D-CNN, combining complementary spatial-spectral and spectral information through 3D and 2D convolutions, adding a variant attention mechanism module to strengthen spatial texture features, and using skip (jump) connections with a grouped-transfer Transformer so that lower layers can better learn upper-layer features. Firstly, a variant channel attention mechanism is introduced into the 3D-CNN to enhance its acquisition of spectral information. Secondly, a variant spatial attention mechanism is introduced to enable the 3D-CNN to better acquire the spatial information of hyperspectral images, and the acquired spatial and spectral feature information is then passed to the 2D-CNN so that it can better acquire local feature information. Finally, the acquired image feature information is passed to the variant Transformer model to compensate for the fact that a CNN can only acquire hyperspectral image features in local contexts, enabling the model to better acquire global feature information on feature sequences. The proposed model was evaluated on two hyperspectral datasets, Indian Pines and Pavia University (PU). On the PU dataset, the overall classification accuracy (OA), average classification accuracy (AA), and Kappa coefficient reach up to 99.59%, 99.31%, and 99.45%, respectively, an improvement in classification accuracy over current cutting-edge techniques.
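The three scores reported above (OA, AA, and the Kappa coefficient) are standard and can all be derived from a confusion matrix. The following is a minimal sketch using the usual textbook definitions, not code from the paper; the function name and the toy matrix in the usage note are illustrative assumptions, and every class is assumed to have at least one true sample.

```python
def classification_scores(confusion):
    """Return (OA, AA, kappa) for a square confusion matrix.

    confusion[i][j] is the number of samples whose true class is i
    and whose predicted class is j. Assumes every class has at least
    one true sample (otherwise per-class recall is undefined).
    """
    n_classes = len(confusion)
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[i][i] for i in range(n_classes))

    # Overall accuracy: fraction of all samples classified correctly.
    oa = correct / total

    # Average accuracy: mean of the per-class recalls.
    recalls = [confusion[i][i] / sum(confusion[i]) for i in range(n_classes)]
    aa = sum(recalls) / n_classes

    # Cohen's kappa: agreement corrected for chance agreement p_e.
    col_sums = [sum(confusion[i][j] for i in range(n_classes))
                for j in range(n_classes)]
    p_e = sum(sum(confusion[k]) * col_sums[k]
              for k in range(n_classes)) / total ** 2
    kappa = (oa - p_e) / (1 - p_e)
    return oa, aa, kappa
```

For a toy two-class matrix `[[40, 10], [20, 30]]`, this gives an OA of 0.7, an AA of 0.7, and a Kappa of 0.4. Papers typically report all three values as percentages.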
- Research Article
19
- 10.1016/j.eswa.2023.122202
- Oct 17, 2023
- Expert Systems with Applications
Hyperspectral image classification using Second-Order Pooling with Graph Residual Unit Network
- Research Article
192
- 10.1109/tgrs.2019.2910603
- Sep 1, 2019
- IEEE Transactions on Geoscience and Remote Sensing
Hyperspectral image (HSI) classification is a core task in the remote sensing community, and recently, deep learning-based methods have shown their capability of accurate classification of HSIs. Among the deep learning-based methods, deep convolutional neural networks (CNNs) have been widely used for HSI classification. In order to obtain good classification performance, substantial effort is required to design a proper deep learning architecture. Furthermore, a manually designed architecture may not fit a specific data set very well. In this paper, the idea of an automatically designed CNN for HSI classification is proposed for the first time. First, a number of operations, including convolution, pooling, identity, and batch normalization, are selected. Then, a gradient descent-based search algorithm is used to effectively find the optimal deep architecture, which is evaluated on the validation data set. After that, the best CNN architecture is selected as the model for HSI classification. Specifically, the automatic 1-D Auto-CNN and 3-D Auto-CNN are used as spectral and spectral-spatial HSI classifiers, respectively. Furthermore, cutout is introduced as a regularization technique for HSI spectral-spatial classification to further improve the classification accuracy. The experiments on four widely used hyperspectral data sets (i.e., Salinas, Pavia University, Kennedy Space Center, and Indian Pines) show that the automatically designed data-dependent CNNs obtain competitive classification accuracy compared with state-of-the-art methods. In addition, the automatic design of the deep learning architecture opens a new window for future research, showing the huge potential of using neural architectures' optimization capabilities for accurate HSI classification.
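Cutout, the regularizer mentioned above, masks a random square region of each training input. The sketch below shows one plausible way to apply it to a spectral-spatial patch, assuming a `(height, width, bands)` NumPy array and a mask that zeroes every band inside the square; the function name, argument names, and the mask size are illustrative assumptions, not details taken from the paper.

```python
import numpy as np


def cutout(patch, size, rng):
    """Return a copy of `patch` with one random square region zeroed.

    patch: array of shape (height, width, bands).
    size:  side length of the square mask in pixels (assumed value).
    rng:   a numpy.random.Generator, so results are reproducible.
    The square is centred on a random pixel and clipped at the borders,
    so the masked area can be smaller than size x size near an edge.
    """
    height, width = patch.shape[:2]
    cy = int(rng.integers(0, height))
    cx = int(rng.integers(0, width))
    top, left = cy - size // 2, cx - size // 2
    y0, y1 = max(top, 0), min(top + size, height)
    x0, x1 = max(left, 0), min(left + size, width)

    out = patch.copy()          # leave the caller's array untouched
    out[y0:y1, x0:x1, :] = 0.0  # mask all spectral bands in the square
    return out
```

A typical use would apply `cutout(patch, 3, np.random.default_rng(0))` to each training patch on the fly and leave validation and test patches unmasked.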
- Research Article
36
- 10.1109/tgrs.2022.3180685
- Jan 1, 2022
- IEEE Transactions on Geoscience and Remote Sensing
Hyperspectral image (HSI) classification has been a hot topic for decades, as hyperspectral images have rich spatial and spectral information and provide a strong basis for distinguishing different land-cover objects. Benefiting from the development of deep learning technologies, deep learning based HSI classification methods have achieved promising performance. Recently, several neural architecture search (NAS) algorithms have been proposed for HSI classification, which further improve the accuracy of HSI classification to a new level. In this paper, NAS and Transformer are combined to handle the HSI classification task for the first time. Compared with previous work, the proposed method has two main differences. First, we revisit the search spaces designed in previous HSI classification NAS methods and propose a novel hybrid search space, consisting of the space dominated cell and the spectrum dominated cell. Compared with search spaces proposed in previous works, the proposed hybrid search space is more aligned with the characteristics of HSI data, that is, HSIs have a relatively low spatial resolution and an extremely high spectral resolution. Second, to further improve the classification accuracy, we graft the emerging transformer module onto the automatically designed convolutional neural network (CNN) to add global information to the local-region-focused features learned by the CNN. Experimental results on three public HSI datasets show that the proposed method achieves much better performance than comparison approaches, including manually designed networks and NAS based HSI classification methods. In particular, on the most recently captured dataset, Houston University, overall accuracy is improved by nearly 6 percentage points. Code is available at: https://github.com/Cecilia-xue/HyT-NAS.
- Research Article
1
- 10.25932/publishup-52057
- Jan 1, 2021
- publish.UP (University of Potsdam)
In recent years, deep learning has improved the way remote sensing data is processed. The classification of hyperspectral data is no exception. 2D or 3D convolutional neural networks have outperformed classical algorithms on hyperspectral image classification in many cases. However, geological hyperspectral image classification poses several challenges, often involving spatially more complex objects than those found in other disciplines of hyperspectral imaging, which have more spatially similar objects (e.g., industrial applications, or aerial urban or farming land cover types). In geological hyperspectral image classification, classical algorithms that focus on the spectral domain still often show higher accuracy, more sensible results, or greater flexibility due to their independence from spatial information. In the framework of this thesis, inspired by classical machine learning algorithms that focus on the spectral domain, like the binary feature fitting (BFF) and the EnGeoMap algorithm, the author of this thesis proposes, develops, tests, and discusses a novel, spectrally focused, spatial information independent, deep multi-layer convolutional neural network, named 'DeepGeoMap', for hyperspectral geological data classification. More specifically, the architecture of DeepGeoMap uses a sequential series of different 1D convolutional neural network layers and fully connected dense layers and utilizes rectified linear unit and softmax activation, 1D max and 1D global average pooling layers, additional dropout to prevent overfitting, and a categorical cross-entropy loss function with Adam gradient descent optimization. DeepGeoMap was realized using Python 3.7 and the machine and deep learning interface TensorFlow with graphical processing unit (GPU) acceleration.
This 1D spectrally focused architecture allows DeepGeoMap models to be trained with hyperspectral laboratory image data of geochemically validated samples (e.g., ground truth samples for aerial or mine face images); the laboratory-trained model can then be used to classify other or larger scenes, similar to classical algorithms that use a spectral library of validated samples for image classification. The classification capabilities of DeepGeoMap have been tested using two geological hyperspectral image data sets. Both are geochemically validated hyperspectral data sets, one based on iron ore and the other based on copper ore samples. The copper ore laboratory data set was used to train a DeepGeoMap model for the classification and analysis of a larger mine face scene within the Republic of Cyprus, where the samples originated. Additionally, a benchmark satellite-based dataset, the Indian Pines data set, was used for training and testing. The classification accuracy of DeepGeoMap was compared to classical algorithms and other convolutional neural networks. It was shown that DeepGeoMap could achieve higher accuracies and outperform these classical algorithms and other neural networks in the geological hyperspectral image classification test cases. The spectral focus of DeepGeoMap was found to be its most considerable advantage compared to spectral-spatial classifiers like 2D or 3D neural networks. This enables DeepGeoMap models to be trained independently of different spatial entities, shapes, and/or resolutions.
- Research Article
40
- 10.1109/access.2019.2957163
- Jan 1, 2019
- IEEE Access
Recently, convolutional neural networks (CNNs) have been introduced for hyperspectral image (HSI) classification and have shown considerable classification performance. However, previous CNNs designed for spectral-spatial HSI classification emphasize learning the spatial correlation of HSI data and neglect the channel responses of feature maps. Furthermore, the lack of training samples remains the major challenge for CNN-based HSI classification methods to achieve better performance. To address these issues, this paper proposes a new end-to-end pre-activation residual attention network (PRAN) for HSI classification. The pre-activation mechanism and attention mechanism are introduced into the proposed network, and a pre-activation residual attention block (PRAB) is designed, which allows the proposed network to adaptively recalibrate channel responses and learn more robust spectral-spatial joint feature representations. The proposed PRAN is equipped with two PRABs and several convolutional layers with different kernel sizes, which enables the PRAN to extract high-level discriminative features. Experimental results on three benchmark HSI datasets reveal that the proposed method achieves competitive performance against several state-of-the-art HSI classification methods, especially when the training set size is relatively small.
- Research Article
28
- 10.1016/j.sigpro.2024.109669
- Aug 22, 2024
- Signal Processing
State space models meet transformers for hyperspectral image classification
- Research Article
183
- 10.1109/tgrs.2019.2951445
- Dec 5, 2019
- IEEE Transactions on Geoscience and Remote Sensing
Deep convolutional neural networks (CNNs) have shown outstanding performance in hyperspectral image (HSI) classification. The success of CNN-based HSI classification relies on the availability of sufficient training samples. However, the collection of training samples is expensive and time consuming. Meanwhile, there are many models pretrained on large-scale data sets, which extract general and discriminative features. Proper reuse of their low-level and midlevel representations can significantly improve HSI classification accuracy. However, the large-scale ImageNet data set has three channels, whereas HSI contains hundreds of channels. Therefore, it is difficult to directly adapt the pretrained models to the classification of HSIs. In this article, heterogeneous transfer learning for HSI classification is proposed. First, a mapping layer is used to handle the issue of having different numbers of channels. Then, the model architectures and weights of the CNN trained on the ImageNet data sets are used to initialize the model and weights of the HSI classification network. Finally, a well-designed neural network is used to perform the HSI classification task. Furthermore, an attention mechanism is used to adjust the feature maps due to the difference between the heterogeneous data sets. Moreover, controlled random sampling is used as another training sample selection method to test the effectiveness of the proposed methods. Experimental results on four popular hyperspectral data sets with two training sample selection strategies show that the transferred CNN obtains better classification accuracy than state-of-the-art methods. In addition, the idea of heterogeneous transfer learning may open a new window for further research.
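The channel mismatch described above (hundreds of HSI bands versus three ImageNet channels) can be bridged by a learned per-pixel projection. The sketch below assumes the mapping layer behaves like a 1x1 convolution, which is one common choice; the abstract does not say how the paper's mapping layer is built, and the function name, shapes, and random weights here are purely illustrative.

```python
import numpy as np


def map_bands_to_three_channels(cube, weights, bias):
    """Project every pixel's spectrum onto three output channels.

    cube:    array of shape (height, width, bands).
    weights: array of shape (bands, 3), learned during training
             (random values are used in the usage example only).
    bias:    array of shape (3,).
    This per-pixel linear map is equivalent to a 1x1 convolution,
    an assumed form of the mapping layer.
    """
    return cube @ weights + bias


# Usage with hypothetical sizes: a 5x5 patch with 103 bands.
rng = np.random.default_rng(0)
patch = rng.random((5, 5, 103))
w = rng.standard_normal((103, 3)) * 0.01
b = np.zeros(3)
rgb_like = map_bands_to_three_channels(patch, w, b)  # shape (5, 5, 3)
```

The three-channel output can then be fed into a network whose first layer expects ImageNet-style input, with `weights` and `bias` trained jointly with the rest of the classifier.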
- Research Article
145
- 10.1016/j.jag.2021.102603
- Nov 6, 2021
- International Journal of Applied Earth Observation and Geoinformation
Hyperspectral image classification on insufficient-sample and feature learning using deep neural networks: A review
- Research Article
8
- 10.1080/01431161.2024.2370501
- Jul 5, 2024
- International Journal of Remote Sensing
Graph neural networks (GNNs) have recently garnered significant attention due to their exceptional performance across various applications, including hyperspectral (HS) image classification. However, most existing GNN-based models for HS image classification are limited depth models and often suffer from performance degradation as model depth increases. This study introduces HyperGCN, an exclusive GNN-based model designed with multiple graph convolutional layers to exploit the rich spectral information inherent in HS images, thereby enhancing classification performance. To address performance degradation, HyperGCN incorporates techniques resistant to oversmoothing into its architecture. Additionally, multiple-side exit branches are integrated into the intermediate layers of HyperGCN, enabling dynamic management of the complexity of HS images. Less complex HS images are processed by fewer layers, exiting early via attached branches, while more complex images traverse multiple layers until reaching the final output layer. Extensive experiments on four benchmark HS datasets (Indian Pines, Pavia University, Salinas, and Botswana) demonstrate HyperGCN’s superior performance over basic GNN-based models. Notably, HyperGCN outperforms or performs comparably to the CNN-GNN combined model in classifying HS images. Furthermore, the superior performance of multi-exit HyperGCN over its single-exit counterpart emphasizes the effectiveness of incorporating side exit branches in GNN-based HS image classification. Compared to state-of-the-art models, multi-exit HyperGCN demonstrates competitive performance, highlighting its effectiveness in handling complex spectral information in HS images while maintaining an acceptable balance between accuracy and computational efficiency.
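The multi-exit idea above can be sketched as a loop that stops at the first side branch whose prediction is confident enough. The exit rule used here (maximum class probability at or above a threshold) is an assumption chosen for illustration; the abstract does not state HyperGCN's actual criterion, and the function and argument names are hypothetical.

```python
def early_exit_forward(x, layers, exit_heads, threshold=0.9):
    """Run `layers` in order and stop at the first confident exit.

    layers:     list of callables, each transforming the features.
    exit_heads: list of callables (one per layer) mapping features
                to a list of class probabilities.
    threshold:  assumed confidence level needed to exit early.
    Returns (probabilities, index of the exit that was used). If no
    side exit is confident, the last head acts as the final output.
    """
    probs = None
    for index, (layer, head) in enumerate(zip(layers, exit_heads)):
        x = layer(x)
        probs = head(x)
        if max(probs) >= threshold:
            return probs, index
    return probs, len(layers) - 1
```

With this rule, inputs that an early branch already classifies confidently skip the remaining layers, while harder inputs pass through the full stack.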
- Research Article
40
- 10.3390/s21051751
- Mar 3, 2021
- Sensors
Hyperspectral image (HSI) classification is the subject of intense research in remote sensing. The tremendous success of deep learning in computer vision has recently sparked the interest in applying deep learning in hyperspectral image classification. However, most deep learning methods for hyperspectral image classification are based on convolutional neural networks (CNN). Those methods require heavy GPU memory resources and run time. Recently, another deep learning model, the transformer, has been applied for image recognition, and the study result demonstrates the great potential of the transformer network for computer vision tasks. In this paper, we propose a model for hyperspectral image classification based on the transformer, which is widely used in natural language processing. Besides, we believe we are the first to combine the metric learning and the transformer model in hyperspectral image classification. Moreover, to improve the model classification performance when the available training samples are limited, we use the 1-D convolution and Mish activation function. The experimental results on three widely used hyperspectral image data sets demonstrate the proposed model’s advantages in accuracy, GPU memory cost, and running time.
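The Mish activation mentioned above is defined as mish(x) = x · tanh(softplus(x)), where softplus(x) = ln(1 + eˣ). A minimal scalar sketch follows, using a numerically stable form of softplus. It illustrates the standard definition only, not the paper's implementation.

```python
import math


def softplus(x):
    """Numerically stable ln(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def mish(x):
    """Mish activation: x * tanh(softplus(x))."""
    return x * math.tanh(softplus(x))
```

Mish is smooth, approaches the identity for large positive inputs, and approaches zero for large negative inputs while still allowing small negative outputs near the origin.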
- Research Article
8
- 10.1080/01431161.2022.2048916
- Mar 4, 2022
- International Journal of Remote Sensing
Existing graph-based, semi-supervised hyperspectral image (HSI) classification models often suffer from prolonged execution time due to high computational complexity. In this work, we first propose a fast anchor graph regularization (FAGR) model for large-scale HSI classification. FAGR employs a simple anchor-based graph construction procedure and a new adjacency matrix among anchors to dramatically reduce the computational complexity while attaining good classification performance. In order to further improve the classification accuracy of hyperspectral images, we propose a novel semi-supervised anchor graph ensemble (SAGE) model. SAGE is an ensemble realization of multiple FAGR models, with each component FAGR operating on a randomly selected subset of features. A meta classifier is applied to aggregate the outputs of the component classifiers to yield an ensemble classification result. We performed extensive experiments using three real-world HSI datasets to compare the performance of FAGR and SAGE against several existing graph-based HSI classifiers. The experimental results show that the proposed SAGE achieves 95.78% classification accuracy on the Indian Pines dataset using limited labeled samples, outperforming existing models in terms of both shorter execution time and better classification accuracy.