A Novel Classification Framework for Hyperspectral Image Data by Improved Multilayer Perceptron Combined with Residual Network

Abstract

Convolutional neural networks (CNNs) have attracted extensive attention in modern remote sensing image processing and show outstanding performance in hyperspectral image (HSI) classification. Nevertheless, some hyperspectral images have fixed position priors and parameter sharing between different positions, so a common convolution layer may ignore important fine-grained information and cannot be guaranteed to capture the optimal image features. This paper proposes two models for HSI classification: an improved multilayer perceptron (IMLP) and IMLP combined with ResNet (IMLP-ResNet). Drawing on the characteristics of hyperspectral data, we design IMLP around three improvements. First, a depthwise over-parameterized convolutional layer is introduced to increase the learnable parameters of the model, which speeds up convergence without increasing computational complexity. Second, a Focal Loss function is used to suppress useless features in the classification task and enhance the critical spectral–spatial features, which allows the IMLP network to learn more useful hyperspectral image information. Third, to speed up convergence of the network, cosine annealing is introduced to further improve the training performance of IMLP. In addition, the IMLP module is combined with a residual network (IMLP-ResNet) to construct a symmetric structure, which extracts more advanced semantic information from hyperspectral images. The proposed IMLP and IMLP-ResNet are tested on two public HSI datasets (Indian Pines and Pavia University) and a real hyperspectral dataset (Xuzhou). Experimental results demonstrate the superiority of the proposed IMLP-ResNet over several state-of-the-art methods: it attains the highest overall accuracy (OA), outperforming CNN by 8.19%, 6.28%, and 5.59% and ResNet by 3.52%, 3.54%, and 2.67% on the Indian Pines, Pavia University, and Xuzhou datasets, respectively. The results also demonstrate that well-designed MLPs can obtain remarkable HSI classification performance.
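
The abstract names cosine annealing as one of the three IMLP improvements but gives no schedule details. The following is a minimal sketch of a single-cycle cosine schedule, assuming a hypothetical function name `cosine_annealing_lr` and illustrative values for `lr_max`, `lr_min`, and the number of steps; none of these come from the paper.

```python
import math


def cosine_annealing_lr(step, total_steps, lr_max=0.1, lr_min=0.0):
    """Learning rate at `step` on one cosine cycle from lr_max down to lr_min.

    lr_max, lr_min, and total_steps are illustrative assumptions,
    not hyperparameters reported in the abstract.
    """
    if total_steps <= 0:
        raise ValueError("total_steps must be positive")
    # Clamp so that out-of-range steps stay on the schedule's endpoints.
    step = min(max(step, 0), total_steps)
    cosine_term = 0.5 * (1.0 + math.cos(math.pi * step / total_steps))
    return lr_min + (lr_max - lr_min) * cosine_term


# The rate starts at lr_max, falls slowly, then quickly, then slowly again.
schedule = [cosine_annealing_lr(s, 10) for s in range(11)]
```

The slow decay near the start and end of the cycle is the usual motivation for this schedule: large early steps for fast progress, small late steps for stable fine-tuning.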

Similar Papers
  • Research Article
  • Cite Count: 8
  • 10.1080/01431161.2024.2370501
HyperGCN – a multi-layer multi-exit graph neural network to enhance hyperspectral image classification
  • Jul 5, 2024
  • International Journal of Remote Sensing
  • Haseena Rahmath P + 3 more

Graph neural networks (GNNs) have recently garnered significant attention due to their exceptional performance across various applications, including hyperspectral (HS) image classification. However, most existing GNN-based models for HS image classification are limited depth models and often suffer from performance degradation as model depth increases. This study introduces HyperGCN, an exclusive GNN-based model designed with multiple graph convolutional layers to exploit the rich spectral information inherent in HS images, thereby enhancing classification performance. To address performance degradation, HyperGCN incorporates techniques resistant to oversmoothing into its architecture. Additionally, multiple-side exit branches are integrated into the intermediate layers of HyperGCN, enabling dynamic management of the complexity of HS images. Less complex HS images are processed by fewer layers, exiting early via attached branches, while more complex images traverse multiple layers until reaching the final output layer. Extensive experiments on four benchmark HS datasets (Indian Pines, Pavia University, Salinas, and Botswana) demonstrate HyperGCN’s superior performance over basic GNN-based models. Notably, HyperGCN outperforms or performs comparably to the CNN-GNN combined model in classifying HS images. Furthermore, the superior performance of multi-exit HyperGCN over its single-exit counterpart emphasizes the effectiveness of incorporating side exit branches in GNN-based HS image classification. Compared to state-of-the-art models, multi-exit HyperGCN demonstrates competitive performance, highlighting its effectiveness in handling complex spectral information in HS images while maintaining an acceptable balance between accuracy and computational efficiency.
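
The abstract does not say how HyperGCN decides that a sample may leave through a side exit branch. As a rough sketch only, the code below assumes a softmax-confidence threshold, which is a common early-exit criterion but is not confirmed by the source. The function name `classify_with_early_exit` and the `threshold` value are hypothetical.

```python
def classify_with_early_exit(exit_probabilities, threshold=0.9):
    """Return (predicted_class, exit_index) for one sample.

    exit_probabilities: one list of class probabilities per exit,
    ordered from the shallowest side branch to the final output layer.
    The confidence-threshold rule is an assumption, not the paper's rule.
    """
    if not exit_probabilities:
        raise ValueError("at least one exit is required")
    last_index = len(exit_probabilities) - 1
    for exit_index, probs in enumerate(exit_probabilities):
        confidence = max(probs)
        # Leave early when confident enough; the final exit always answers.
        if confidence >= threshold or exit_index == last_index:
            return probs.index(confidence), exit_index


easy_sample = [[0.95, 0.05], [0.97, 0.03], [0.99, 0.01]]
hard_sample = [[0.60, 0.40], [0.55, 0.45], [0.20, 0.80]]
easy_result = classify_with_early_exit(easy_sample)
hard_result = classify_with_early_exit(hard_sample)
```

Under this assumed rule, the easy sample stops at the first branch while the hard sample traverses every layer, which mirrors the behavior the abstract describes.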

  • Research Article
  • Cite Count: 11
  • 10.1109/access.2020.2974025
An Encoder–Decoder Convolution Network With Fine-Grained Spatial Information for Hyperspectral Images Classification
  • Jan 1, 2020
  • IEEE Access
  • Zhongwei Li + 4 more

Convolutional neural networks (CNNs) are widely used in hyperspectral image (HSI) classification. However, most CNN-based HSI classification methods discard fine-grained spatial (FGS) details during a sequence of convolution and pooling operations. To address this issue, a unified encoder-decoder framework, denoted FGSCNN, is proposed to integrate high-level semantics and FGS details for HSI classification. The encoder, a series of convolution and pooling layers, captures high-level semantic information in low-resolution feature maps. The decoder fuses the high-level, low-resolution semantic information with the fine-grained, high-resolution spatial information to obtain FGS features with high-level semantics. Deconvolution layers and skip connections are used in the decoder to retain the FGS details, while convolution layers combine the FGS features with high-level semantics. Based on the encoder-decoder framework, a unified loss function is used to integrate the high-level semantic information and FGS details in an end-to-end manner. Experiments on three public datasets (Indian Pines, Pavia University, and Salinas) demonstrate the effectiveness of the proposed method for HSI classification.

  • Research Article
  • Cite Count: 32
  • 10.3390/mi12050545
Hybrid Dilated Convolution with Multi-Scale Residual Fusion Network for Hyperspectral Image Classification.
  • May 10, 2021
  • Micromachines
  • Chenming Li + 5 more

The convolutional neural network (CNN) has been proven to perform better in hyperspectral image (HSI) classification than traditional methods. Traditional CNNs for hyperspectral image classification tend to pay more attention to spectral features and ignore spatial information. In this paper, a new HSI model called the local and hybrid dilated convolution fusion network (LDFN) is proposed, which fuses local detail information and rich spatial features by expanding the receptive field. The method works as follows. First, several operations are selected, such as standard convolution, average pooling, dropout, and batch normalization. Then, fusion operations of local and hybrid dilated convolution are included to extract rich spatial-spectral information. Last, different convolution layers are gathered into residual fusion networks, and the result is fed into a softmax layer for classification. Experiments on three widely used hyperspectral datasets (Salinas, Pavia University, and Indian Pines) show that LDFN outperforms state-of-the-art classifiers.
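
To make the receptive-field expansion concrete, the sketch below applies the standard dilated-convolution arithmetic: a kernel of size k with dilation d spans k + (k - 1)(d - 1) input positions. The helper names, the stride-1 assumption, and the dilation rates (1, 2, 3) are illustrative assumptions; the abstract does not list LDFN's actual rates.

```python
def effective_kernel_size(kernel_size, dilation):
    """Input span covered by one dilated kernel along one axis."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)


def stacked_receptive_field(kernel_size, dilations):
    """Receptive field of stacked stride-1 convolutions along one axis."""
    receptive_field = 1
    for dilation in dilations:
        # Each stride-1 layer widens the field by (effective kernel - 1).
        receptive_field += effective_kernel_size(kernel_size, dilation) - 1
    return receptive_field


# Three 3x3 layers: plain convolution versus mixed dilation rates.
plain_field = stacked_receptive_field(3, [1, 1, 1])
hybrid_field = stacked_receptive_field(3, [1, 2, 3])
```

With the same number of weights per layer, the mixed-rate stack sees a 13-pixel span instead of 7, which is the sense in which dilation enlarges the receptive field at no extra parameter cost.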

  • Conference Article
  • Cite Count: 1
  • 10.1145/3641584.3641609
Hyperspectral Image Classification Using 3D Attention Mechanism in Collaboration with Transformer
  • Sep 22, 2023
  • Yubing Wang + 2 more

With the continuous innovation in deep learning, introducing deep learning into hyperspectral image classification to enhance classification accuracy has become a major research direction. Convolutional neural networks (CNNs) are among the most commonly used deep learning methods for visual data processing and are widely used in hyperspectral image (HSI) classification by virtue of their excellent contextual modeling capability. The performance of HSI classification is highly dependent on spatial and spectral information, yet current CNN-based HSI classification models extract spatial-spectral features insufficiently and fail to mine and represent the sequence properties of spectral features well. To address these problems, this paper proposes an HSI classification method that uses a 3D attention mechanism in collaboration with a Transformer. We introduce a variant Transformer model based on a hybrid of improved 3D-CNN and 2D-CNN, which combines complementary spatial-spectral and spectral information through 3D and 2D convolution, adds a variant attention mechanism module to strengthen spatial texture features, and combines a grouped transfer Transformer with skip connections so that the lower layers can better learn the upper-layer features. First, a variant channel attention mechanism is introduced on the 3D-CNN to enhance its acquisition of spectral information. Second, a variant spatial attention mechanism is introduced so that the 3D-CNN better acquires the spatial information of hyperspectral images, and the acquired spatial and spectral features are then passed to the 2D-CNN so that it can better acquire local feature information. Finally, the acquired image features are passed to the variant Transformer model, which compensates for the fact that a CNN can only acquire hyperspectral image features in local contexts and enables better acquisition of global information on feature sequences. The proposed model is evaluated on two hyperspectral datasets, Indian Pines and Pavia University. On the PU dataset, the overall classification accuracy (OA), average classification accuracy (AA), and Kappa coefficient reach up to 99.59%, 99.31%, and 99.45%, respectively, an improvement in classification accuracy over current cutting-edge techniques.
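
The three metrics reported above (OA, AA, and the Kappa coefficient) can all be derived from a confusion matrix. The sketch below uses the standard definitions; the function name `classification_metrics` and the toy 2-class matrix are made up for illustration and are not data from any of the papers listed here.

```python
def classification_metrics(confusion):
    """Return (OA, AA, kappa) from a square confusion matrix.

    confusion[i][j] counts samples of true class i predicted as class j.
    Every class is assumed to have at least one true sample.
    """
    n_classes = len(confusion)
    total = sum(sum(row) for row in confusion)
    row_sums = [sum(row) for row in confusion]
    col_sums = [sum(confusion[i][j] for i in range(n_classes))
                for j in range(n_classes)]
    if total == 0 or any(r == 0 for r in row_sums):
        raise ValueError("each class needs at least one true sample")
    correct = sum(confusion[i][i] for i in range(n_classes))
    # Overall accuracy: fraction of all samples classified correctly.
    overall_accuracy = correct / total
    # Average accuracy: mean of the per-class recalls.
    average_accuracy = sum(confusion[i][i] / row_sums[i]
                           for i in range(n_classes)) / n_classes
    # Kappa: agreement beyond what chance alone would produce.
    expected = sum(r * c for r, c in zip(row_sums, col_sums)) / total ** 2
    kappa = (overall_accuracy - expected) / (1.0 - expected)
    return overall_accuracy, average_accuracy, kappa


# Toy matrix (made up): class 0 is always right, class 1 is right 40 of 50 times.
oa, aa, kappa = classification_metrics([[50, 0], [10, 40]])
```

Papers in this area often report Kappa multiplied by 100 alongside OA and AA, which is why it appears as a percentage in the abstract above.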

  • Research Article
  • Cite Count: 1
  • 10.3390/photonics12020146
3DVT: Hyperspectral Image Classification Using 3D Dilated Convolution and Mean Transformer
  • Feb 11, 2025
  • Photonics
  • Xinling Su + 1 more

Hyperspectral imaging and laser technology both rely on different wavelengths of light to analyze the characteristics of materials, revealing their composition, state, or structure through precise spectral data. In hyperspectral image (HSI) classification tasks, the limited number of labeled samples and the lack of feature extraction diversity often lead to suboptimal classification performance. Furthermore, traditional convolutional neural networks (CNNs) primarily focus on local features in hyperspectral data, neglecting long-range dependencies and global context. To address these challenges, this paper proposes a novel model that combines CNNs with an average pooling Vision Transformer (ViT) for hyperspectral image classification. The model utilizes three-dimensional dilated convolution and two-dimensional convolution to extract multi-scale spatial–spectral features, while ViT was employed to capture global features and long-range dependencies in the hyperspectral data. Unlike the traditional ViT encoder, which uses linear projection, our model replaces it with average pooling projection. This change enhances the extraction of local features and compensates for the ViT encoder’s limitations in local feature extraction. This hybrid approach effectively combines the local feature extraction strengths of CNNs with the long-range dependency handling capabilities of Transformers, significantly improving overall performance in hyperspectral image classification tasks. Additionally, the proposed method holds promise for the classification of fiber laser spectra, where high precision and spectral analysis are crucial for distinguishing between different fiber laser characteristics. Experimental results demonstrate that the CNN-Transformer model substantially improves classification accuracy on three benchmark hyperspectral datasets. The overall accuracies achieved on the three public datasets—IP, PU, and SV—were 99.35%, 99.31%, and 99.66%, respectively. 
These advancements offer potential benefits for a wide range of applications, including high-performance optical fiber sensing, laser medicine, and environmental monitoring, where accurate spectral classification is essential for the development of advanced systems in fields such as laser medicine and optical fiber technology.

  • Research Article
  • Cite Count: 10
  • 10.1016/j.infrared.2024.105401
Hyperspectral image classification based on deep separable residual attention network
  • Jun 17, 2024
  • Infrared Physics and Technology
  • Chao Tu + 3 more


  • Conference Article
  • Cite Count: 82
  • 10.1109/ictai.2016.0158
Cube-CNN-SVM: A Novel Hyperspectral Image Classification Method
  • Nov 1, 2016
  • Jiabing Leng + 4 more

CNNs (convolutional neural networks) have been proved to be efficient deep learning models that can directly extract high-level features from raw data. In this paper, a novel CCS (Cube-CNN-SVM) method is proposed for hyperspectral image classification; it is a hybrid model of CNN and SVM (support vector machine) based on spectral-spatial features. Unlike most traditional methods, which take only spectral information into consideration, CCS organizes a target pixel and the spectral information of its neighbors into a spectral-spatial multi-feature cube used for classification. This is a straightforward but valid spatial strategy that can easily improve classification accuracy without modifying the deep CNN's structure, apart from the size of the input layer and the convolutional kernel. Our deep CNN consists of an input layer, a convolutional layer, a max pooling layer, a fully connected layer, and an output layer. To further improve classification accuracy, an SVM is trained as the hyperspectral image classifier on the features extracted by the deep CNN from the spectral-spatial fusion information. Three hyperspectral image datasets, KSC (Kennedy Space Center), PU (Pavia University Scene), and Indian Pines, are used to evaluate the performance of the CCS method. Experimental results indicate that hyperspectral image classification can be improved efficiently with the spectral-spatial fusion strategy and the CCS method. First, the spatial strategy is easy to implement and improves classification accuracy by about 4% compared with using spectral information alone, reaching 98.49% on the KSC dataset. Second, the CCS method further improves classification accuracy by about 1% to 3% over the best performance of the deep CNN, reaching 99.45% on the PU dataset.
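
The spectral-spatial cube described above can be illustrated with a small patch-extraction routine. This is a sketch under stated assumptions: the function name `extract_cube`, the default 3 × 3 window, and edge replication at the image border are illustrative choices, since the abstract does not specify the neighborhood size or the border handling.

```python
def extract_cube(image, row, col, window=3):
    """Collect the spectra of a target pixel and its neighbors.

    image[r][c] is the list of band values at pixel (r, c).
    Returns a window x window grid of spectra centered on (row, col).
    Border pixels are handled by edge replication (an assumption).
    """
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd number")
    height, width = len(image), len(image[0])
    half = window // 2
    cube = []
    for d_row in range(-half, half + 1):
        r = min(max(row + d_row, 0), height - 1)  # clamp to the image
        cube_row = []
        for d_col in range(-half, half + 1):
            c = min(max(col + d_col, 0), width - 1)
            cube_row.append(list(image[r][c]))  # copy the spectrum
        cube.append(cube_row)
    return cube


# Toy 3 x 3 image with 2 bands per pixel; band values encode (row, col).
toy_image = [[[r, c] for c in range(3)] for r in range(3)]
center_cube = extract_cube(toy_image, 1, 1)
corner_cube = extract_cube(toy_image, 0, 0)
```

Each cube would then be the input to the CNN in place of a single pixel spectrum, which is how spatial context enters the classifier.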

  • Research Article
  • Cite Count: 24
  • 10.3390/rs14153705
A Hyperspectral Image Classification Method Based on Adaptive Spectral Spatial Kernel Combined with Improved Vision Transformer
  • Aug 2, 2022
  • Remote Sensing
  • Aili Wang + 4 more

In recent years, methods based on deep convolutional neural networks (CNNs) have dominated hyperspectral image (HSI) classification. Although CNN-based HSI classification methods are good at spatial feature extraction, HSIs are characterized by approximately continuous spectral information, usually containing hundreds of spectral bands. CNNs cannot mine and represent the sequence properties of spectral features well, whereas the attention-based transformer model has proven its advantages in processing sequence data. This study proposes a new spectral-spatial kernel combined with an improved Vision Transformer (ViT) to jointly extract spatial-spectral features for classification. First, the hyperspectral data are dimensionally reduced by PCA; then, shallow features are extracted with a spectral-spatial kernel, and the extracted features are input into the improved ViT model. The improved ViT adds a re-attention mechanism and a local mechanism to the original ViT. The re-attention mechanism increases the diversity of attention maps at different levels. The local mechanism makes full use of the local and global information of the data to improve classification accuracy. Finally, a multi-layer perceptron is used to obtain the classification result. In this framework, the Focal Loss function is used to increase the loss weight of small-class and difficult-to-classify samples in the HSI data and to reduce the loss weight of easy-to-classify samples, so that the network can learn more useful hyperspectral image information. In addition, the Apollo optimizer is used to train the HSI classification model, updating the network parameters that affect model training and output so as to minimize the loss function. We evaluated the classification performance of the proposed method on four different datasets and achieved good classification results on urban land object classification, crop classification, and mineral classification. Compared with the state-of-the-art backbone network, the method achieves a significant improvement in classification accuracy.
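
The re-weighting behavior described above matches the standard focal loss form, FL(p_t) = -alpha * (1 - p_t)^gamma * log(p_t), where p_t is the predicted probability of the true class. The sketch below implements that generic form for one sample; the defaults `gamma=2.0` and `alpha=1.0` are illustrative assumptions, not hyperparameters taken from the paper.

```python
import math


def focal_loss(probs, target, gamma=2.0, alpha=1.0, eps=1e-12):
    """Focal loss for one sample.

    probs: predicted class probabilities; target: index of the true class.
    gamma and alpha are assumed defaults, not values from the abstract.
    """
    p_t = min(max(probs[target], eps), 1.0)  # guard against log(0)
    # (1 - p_t)^gamma shrinks the loss of samples that are already easy.
    return -alpha * (1.0 - p_t) ** gamma * math.log(p_t)


easy = focal_loss([0.9, 0.1], target=0)   # confident and correct
hard = focal_loss([0.1, 0.9], target=0)   # confident and wrong
plain_cross_entropy = focal_loss([0.9, 0.1], target=0, gamma=0.0)
```

With `gamma=0` the expression reduces to ordinary cross-entropy, so `gamma` directly controls how strongly easy samples are down-weighted relative to hard ones.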

  • Research Article
  • Cite Count: 2
  • 10.3390/jimaging9070141
An Effective Hyperspectral Image Classification Network Based on Multi-Head Self-Attention and Spectral-Coordinate Attention
  • Jul 10, 2023
  • Journal of Imaging
  • Minghua Zhang + 4 more

In hyperspectral image (HSI) classification, convolutional neural networks (CNNs) have been widely employed and achieved promising performance. However, CNN-based methods face difficulties in achieving both accurate and efficient HSI classification due to their limited receptive fields and deep architectures. To alleviate these limitations, we propose an effective HSI classification network based on multi-head self-attention and spectral-coordinate attention (MSSCA). Specifically, we first reduce the redundant spectral information of HSI by using a point-wise convolution network (PCN) to enhance discriminability and robustness of the network. Then, we capture long-range dependencies among HSI pixels by introducing a modified multi-head self-attention (M-MHSA) model, which applies a down-sampling operation to alleviate the computing burden caused by the dot-product operation of MHSA. Furthermore, to enhance the performance of the proposed method, we introduce a lightweight spectral-coordinate attention fusion module. This module combines spectral attention (SA) and coordinate attention (CA) to enable the network to better weight the importance of useful bands and more accurately localize target objects. Importantly, our method achieves these improvements without increasing the complexity or computational cost of the network. To demonstrate the effectiveness of our proposed method, experiments were conducted on three classic HSI datasets: Indian Pines (IP), Pavia University (PU), and Salinas. The results show that our proposed method is highly competitive in terms of both efficiency and accuracy when compared to existing methods.

  • Research Article
  • Cite Count: 1
  • 10.25932/publishup-52057
DeepGeoMap : a deep learning convolutional neural network architecture for geological hyperspectral classification and mapping
  • Jan 1, 2021
  • publish.UP (University of Potsdam)
  • Helge Leoard Carl Dämpfling

In recent years, deep learning improved the way remote sensing data is processed. The classification of hyperspectral data is no exception. 2D or 3D convolutional neural networks have outperformed classical algorithms on hyperspectral image classification in many cases. However, geological hyperspectral image classification includes several challenges, often including spatially more complex objects than found in other disciplines of hyperspectral imaging that have more spatially similar objects (e.g., as in industrial applications, aerial urban- or farming land cover types). In geological hyperspectral image classification, classical algorithms that focus on the spectral domain still often show higher accuracy, more sensible results, or flexibility due to spatial information independence. In the framework of this thesis, inspired by classical machine learning algorithms that focus on the spectral domain like the binary feature fitting- (BFF) and the EnGeoMap algorithm, the author of this thesis proposes, develops, tests, and discusses a novel, spectrally focused, spatial information independent, deep multi-layer convolutional neural network, named 'DeepGeoMap’, for hyperspectral geological data classification. More specifically, the architecture of DeepGeoMap uses a sequential series of different 1D convolutional neural networks layers and fully connected dense layers and utilizes rectified linear unit and softmax activation, 1D max and 1D global average pooling layers, additional dropout to prevent overfitting, and a categorical cross-entropy loss function with Adam gradient descent optimization. DeepGeoMap was realized using Python 3.7 and the machine and deep learning interface TensorFlow with graphical processing unit (GPU) acceleration. 
This 1D spectrally focused architecture allows DeepGeoMap models to be trained with hyperspectral laboratory image data of geochemically validated samples (e.g., ground truth samples for aerial or mine face images) and then use this laboratory trained model to classify other or larger scenes, similar to classical algorithms that use a spectral library of validated samples for image classification. The classification capabilities of DeepGeoMap have been tested using two geological hyperspectral image data sets. Both are geochemically validated hyperspectral data sets one based on iron ore and the other based on copper ore samples. The copper ore laboratory data set was used to train a DeepGeoMap model for the classification and analysis of a larger mine face scene within the Republic of Cyprus, where the samples originated from. Additionally, a benchmark satellite-based dataset, the Indian Pines data set, was used for training and testing. The classification accuracy of DeepGeoMap was compared to classical algorithms and other convolutional neural networks. It was shown that DeepGeoMap could achieve higher accuracies and outperform these classical algorithms and other neural networks in the geological hyperspectral image classification test cases. The spectral focus of DeepGeoMap was found to be the most considerable advantage compared to spectral-spatial classifiers like 2D or 3D neural networks. This enables DeepGeoMap models to train data independently of different spatial entities, shapes, and/or resolutions.

  • Research Article
  • Cite Count: 9
  • 10.1117/1.jrs.15.042612
Hyperspectral image classification using deep convolutional neural network and stochastic relaxation labeling
  • Oct 5, 2021
  • Journal of Applied Remote Sensing
  • Manoj K Singh + 2 more

Convolutional neural networks (CNNs) have shown tremendous success for hyperspectral image classification in recent years. CNNs are capable of capturing multi-scale spectral–spatial characteristics of hyperspectral image pixels leading to good classification results. Despite the good accuracy, most of the classifiers misclassify some pixels and generate noisy classification maps. A deep CNN and Markov random field (MRF)-based two-stage classification framework is developed for hyperspectral images. The input image is first classified with the help of a deep CNN classifier. The results provided by CNN are further refined by applying stochastic relaxation labeling using MRF on the first-stage classification map to produce a refined classification map with better accuracy. This two-stage classification approach is particularly helpful if smaller misclassified regions are generated during the first-stage classification. Experiments are performed on one satellite-borne and three airborne hyperspectral images: Dioni, Indian Pines, Pavia University, and Salinas. The results show that the proposed method yields good classification accuracy and smoothed classification maps. The refinement by MRF relaxation improved the overall classification accuracy of the first-stage classifier by more than 2% for all the images. The overall classification accuracy in terms of κ coefficient is obtained as 0.9844, 0.9678, 0.9843, and 0.9841 for Dioni, Indian Pines, Pavia University, and Salinas images, respectively, which is comparable or better than several existing methods.

  • Research Article
  • Cite Count: 18
  • 10.1016/j.eswa.2024.123155
Hyperspectral image classification based on a novel Lush multi-layer feature fusion bias network
  • Jan 18, 2024
  • Expert Systems with Applications
  • Cuiping Shi + 2 more


  • Research Article
  • Cite Count: 30
  • 10.1016/j.engappai.2024.108669
MSTSENet: Multiscale Spectral–Spatial Transformer with Squeeze and Excitation network for hyperspectral image classification
  • May 30, 2024
  • Engineering Applications of Artificial Intelligence
  • Irfan Ahmad + 4 more


  • Research Article
  • 10.1371/journal.pone.0300013.r006
Attention 3D central difference convolutional dense network for hyperspectral image classification
  • Apr 10, 2024
  • PLOS ONE
  • Mahmood Ashraf + 9 more

Hyperspectral Images (HSI) classification is a challenging task due to a large number of spatial-spectral bands of images with high inter-similarity, extra variability classes, and complex region relationships, including overlapping and nested regions. Classification becomes a complex problem in remote sensing images like HSIs. Convolutional Neural Networks (CNNs) have gained popularity in addressing this challenge by focusing on HSI data classification. However, the performance of 2D-CNN methods heavily relies on spatial information, while 3D-CNN methods offer an alternative approach by considering both spectral and spatial information. Nonetheless, the computational complexity of 3D-CNN methods increases significantly due to the large capacity size and spectral dimensions. These methods also face difficulties in manipulating information from local intrinsic detailed patterns of feature maps and low-rank frequency feature tuning. To overcome these challenges and improve HSI classification performance, we propose an innovative approach called the Attention 3D Central Difference Convolutional Dense Network (3D-CDC Attention DenseNet). Our 3D-CDC method leverages the manipulation of local intrinsic detailed patterns in the spatial-spectral features maps, utilizing pixel-wise concatenation and spatial attention mechanism within a dense strategy to incorporate low-rank frequency features and guide the feature tuning. Experimental results on benchmark datasets such as Pavia University, Houston 2018, and Indian Pines demonstrate the superiority of our method compared to other HSI classification methods, including state-of-the-art techniques. The proposed method achieved 97.93% overall accuracy on the Houston-2018, 99.89% on Pavia University, and 99.38% on the Indian Pines dataset with the 25 × 25 window size.
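
The abstract does not give the formula behind its central difference convolution. As an illustration only, the sketch below uses the generic central difference form, in which the center value scaled by the kernel's weight sum is subtracted from an ordinary convolution: y = sum_k w_k * x_k - theta * x_center * sum_k w_k. The reduction to 1D, the function name `central_difference_conv1d`, and the default `theta` are assumptions; the paper's 3D variant may differ.

```python
def central_difference_conv1d(signal, weights, theta=0.7):
    """1D central difference convolution over valid positions (no padding).

    theta = 0 gives an ordinary convolution (correlation form);
    theta = 1 responds only to differences from the center value.
    """
    if len(weights) % 2 == 0:
        raise ValueError("weights must have odd length")
    if not 0.0 <= theta <= 1.0:
        raise ValueError("theta must lie in [0, 1]")
    half = len(weights) // 2
    weight_sum = sum(weights)
    output = []
    for center in range(half, len(signal) - half):
        vanilla = sum(w * signal[center - half + k]
                      for k, w in enumerate(weights))
        # Subtract the center-pixel term to emphasize local gradients.
        output.append(vanilla - theta * signal[center] * weight_sum)
    return output


flat_response = central_difference_conv1d([1, 1, 1, 1], [1, 2, 1], theta=1.0)
edge_response = central_difference_conv1d([0, 0, 1, 1], [1, 1, 1], theta=1.0)
```

A flat signal produces a zero response at `theta=1`, while a step edge does not, which shows how this operator favors local detail patterns over absolute intensity.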

  • Research Article
  • Cite Count: 8
  • 10.1371/journal.pone.0300013
Attention 3D central difference convolutional dense network for hyperspectral image classification.
  • Apr 10, 2024
  • PloS one
  • Mahmood Ashraf + 5 more

