Light-Weight Self-Supervised Contrastive Learning Network For Small Sample Hyperspectral Image Classification
In recent years, the rapid development of deep learning has greatly improved the performance of hyperspectral image (HSI) classification. However, because acquiring high-quality HSI data is costly and labeling it demands substantial manpower and material resources, the small-sample problem remains a persistent challenge. To address it, a spatial-spectral data augmentation method is designed to expand the limited data set and generate new training samples. Simultaneously, a light-weight self-supervised contrastive learning network is constructed to extract spatial-spectral features, focusing on the relative relationships between samples to improve sample discrimination. Experimental results on two public HSI data sets demonstrate that the proposed method achieves better performance than several state-of-the-art contrastive learning methods under the small-sample challenge.
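The abstract does not specify the contrastive objective; a minimal NumPy sketch of the standard NT-Xent (normalized temperature-scaled cross-entropy) loss used by many contrastive frameworks, assuming paired embeddings of two augmented views of the same batch, looks like:

```python
import numpy as np

def nt_xent_loss(z1, z2, temperature=0.5):
    """NT-Xent contrastive loss.

    z1, z2: (N, D) embeddings of two augmented views of the same N samples.
    Positive pairs are (z1[i], z2[i]); every other embedding in the batch
    serves as a negative.
    """
    n = z1.shape[0]
    z = np.concatenate([z1, z2], axis=0)              # (2N, D)
    z = z / np.linalg.norm(z, axis=1, keepdims=True)  # L2-normalize rows
    sim = z @ z.T / temperature                       # scaled cosine similarities
    np.fill_diagonal(sim, -np.inf)                    # exclude self-similarity
    # Index of each row's positive partner: i <-> i + n.
    pos = np.concatenate([np.arange(n, 2 * n), np.arange(n)])
    log_prob = sim - np.log(np.exp(sim).sum(axis=1, keepdims=True))
    return -log_prob[np.arange(2 * n), pos].mean()

rng = np.random.default_rng(0)
z1 = rng.normal(size=(8, 16))
loss_random = nt_xent_loss(z1, rng.normal(size=(8, 16)))
loss_aligned = nt_xent_loss(z1, z1 + 0.01 * rng.normal(size=(8, 16)))
print(loss_aligned < loss_random)  # well-aligned views give a lower loss
```

Pulling the two views of one sample together while pushing apart all other batch members is what lets such a loss exploit relative relationships between samples without labels.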
- Research Article
11
- 10.1155/2021/1759111
- Jan 1, 2021
- Computational Intelligence and Neuroscience
With the fast evolution of remote sensing and spectral imaging techniques, hyperspectral image (HSI) classification has attracted considerable attention in various fields, including land survey and resource monitoring, among others. Nonetheless, due to a lack of distinctiveness among the hyperspectral pixels of separate classes, there is a recurrent inseparability obstacle in the primary space. An additional open challenge stems from finding efficient techniques that can speedily classify and interpret the spectral-spatial data bands within a more precise computational time. Hence, in this work, we propose a 3D-2D convolutional neural network and transfer learning model in which the early layers exploit 3D convolutions to model spectral-spatial information, topped by 2D convolutional layers that mainly handle semantic abstraction. Toward a simple and highly modularized network for image classification, we leverage the ResNeXt-50 block for our model. Furthermore, to improve the separability among classes and balance the interclass and intraclass criteria, we employed principal component analysis (PCA) to obtain the best orthogonal vectors for representing information from HSIs before feeding them to the network. The experimental results show that our model can efficiently improve hyperspectral imagery classification, including an instantaneous representation of the spectral-spatial information. Our model was evaluated on five publicly available hyperspectral datasets, Indian Pines (IP), Pavia University Scene (PU), Salinas Scene (SA), Botswana (BS), and Kennedy Space Center (KSC), achieving high classification accuracies of 99.85%, 99.98%, 100%, 99.82%, and 99.71%, respectively. Quantitative results demonstrated that it outperformed several state-of-the-art (SOTA) deep neural network-based approaches and standard classifiers, providing more insight into hyperspectral image classification.
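The PCA preprocessing step described above can be sketched with plain NumPy (the cube dimensions and component count here are illustrative, not the paper's settings): each pixel's spectrum is treated as one sample and projected onto the leading principal axes before the reduced cube is fed to the network.

```python
import numpy as np

def pca_reduce(cube, n_components):
    """Project each pixel's spectrum onto the top principal components."""
    h, w, bands = cube.shape
    x = cube.reshape(-1, bands).astype(float)  # one row per pixel spectrum
    x = x - x.mean(axis=0)                     # center each band
    # Right singular vectors of the centered data are the principal axes,
    # ordered by explained variance.
    _, _, vt = np.linalg.svd(x, full_matrices=False)
    return (x @ vt[:n_components].T).reshape(h, w, n_components)

rng = np.random.default_rng(0)
cube = rng.normal(size=(16, 16, 64))           # synthetic (H, W, Bands) cube
reduced = pca_reduce(cube, 10)
print(reduced.shape)  # (16, 16, 10)
```

Reducing the spectral dimension this way shrinks the input channels the 3D convolutions must process while retaining the directions of maximum spectral variance.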
- Research Article
2
- 10.3934/mbe.2024138
- Jan 1, 2024
- Mathematical Biosciences and Engineering
Carotid plaque classification from ultrasound images is crucial for predicting ischemic stroke risk. While deep learning has shown effectiveness, it heavily relies on substantial labeled datasets; achieving high performance with limited labeled images is essential for clinical use. Self-supervised learning (SSL) offers a potential solution; however, existing works mainly focus on constructing the SSL tasks, neglecting the use of multiple tasks for pretraining. To overcome these limitations, this study proposed a self-supervised fusion network (Fusion-SSL) for carotid plaque ultrasound image classification with limited labeled data. Fusion-SSL consists of two SSL tasks: classifying image block order (Ordering) and predicting image rotation angle (Rotating). A dual-branch residual neural network was developed to fuse the feature representations learned by the two tasks, which can extract richer visual boundary shape and contour information than a single task. In this experiment, 1270 carotid plaque ultrasound images were collected from 844 patients at Zhongnan Hospital (Wuhan, China). The results showed that Fusion-SSL outperforms single SSL methods across different percentages of labeled training data, ranging from 10% to 100%. Moreover, with only 40% labeled training data, Fusion-SSL achieved comparable results to a single SSL method (predicting image rotation angle) with 100% labeled data. These results indicate that Fusion-SSL could be beneficial for the classification of carotid plaques and the early warning of a stroke in clinical practice.
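The rotation-prediction pretext task (Rotating) can be sketched as follows; this is a generic illustration, not the paper's code, and assumes square single-channel images. The labels come for free from the data itself, which is what makes the task self-supervised.

```python
import numpy as np

def rotation_pretext(images):
    """Turn unlabeled images into a 4-way rotation-prediction dataset.

    images: (N, H, H) array of square images. Each image yields four
    copies rotated by 0/90/180/270 degrees; the label is the rotation
    index k in {0, 1, 2, 3}.
    """
    rotated, labels = [], []
    for img in images:
        for k in range(4):
            rotated.append(np.rot90(img, k))
            labels.append(k)
    return np.stack(rotated), np.array(labels)

imgs = np.arange(2 * 5 * 5).reshape(2, 5, 5)
x, y = rotation_pretext(imgs)
print(x.shape, y[:4])  # (8, 5, 5) [0 1 2 3]
```

A network pretrained to predict `k` must learn orientation-sensitive boundary and contour cues, which is why the abstract credits the task with capturing shape information.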
- Research Article
1
- 10.54254/2755-2721/81/20241009
- Nov 8, 2024
- Applied and Computational Engineering
This paper reviews the application and improvement of convolutional neural networks (CNNs) in image classification. Firstly, a shallow CNN for interstitial lung disease image classification is presented; this model suppresses overfitting through a unique network architecture and optimisation algorithm. Next, the improved VGG16 architecture and the MIDNet18 model are discussed and their superior performance in brain tumour image classification is demonstrated. Subsequently, a CNN-CapsNet model for cervical cancer image classification and its improvement are presented, and the customised model is compared with the conventional VGG-16 CNN architecture. Next, the application of sparse convolutional kernels and hybrid sparse convolutional kernels (HDCs) in reducing computational resource consumption is presented. Subsequently, methods for addressing limited training data through transfer learning and network data augmentation techniques are discussed, as well as GAN-generated datasets for mitigating overfitting. Finally, the effect of degraded images on the classification effectiveness of CNNs is explored. The results show that the improved CNN architectures and algorithms are effective in addressing overfitting and computational resource consumption and can significantly improve the accuracy and efficiency of image classification. Moreover, degraded images do adversely affect the accuracy of CNNs for image classification.
- Research Article
2
- 10.1080/17538947.2025.2520480
- Aug 1, 2025
- International Journal of Digital Earth
Deep learning models have obtained great success in hyperspectral image classification tasks. Nevertheless, they are usually vulnerable to adversarial attacks. Some existing works have attempted to defend against adversarial attacks in HSI classification, but they primarily focus on large numbers of adversarial samples and spatial relationships while overlooking the strong long-range dependencies in HSI. To alleviate this problem, we propose a novel spectral spatial mamba adversarial defense network (SSMADNet) for hyperspectral adversarial image classification. It includes a dense involution branch, a spectral mamba branch, and a spatial multiscale mamba branch. The dense involution branch extracts embedding features via three dense involution layers. The spectral mamba branch learns the spectral sequence information from HSI adversarial samples. The spatial multiscale mamba branch models the long-range interactions across the whole image. Finally, a spectral spatial feature enhancement module is designed to adaptively enhance useful spectral spatial features of HSI. Extensive experimental results demonstrate that, on five HSI adversarial datasets, the proposed SSMADNet achieves higher classification accuracies than state-of-the-art adversarial defense methods. In particular, our method obtains the best OA (93.80%) on the Botswana adversarial data, which is much higher than the suboptimal method (OA = 90.30%).
- Research Article
11
- 10.1080/2150704x.2019.1686780
- Nov 12, 2019
- Remote Sensing Letters
Recent research shows that deep learning-based methods can achieve promising performance when applied to hyperspectral image (HSI) classification in remote sensing, yet some challenging issues still exist. For example, after a number of 2D convolutions, each feature map may only correspond to a unique dimension of the hyperspectral image; as a result, the relationships between the feature maps derived from a multi-dimensional hyperspectral image cannot be extracted well. Another issue is that information in extracted feature maps may be erased by pooling operations. To address these problems, we propose a novel hybrid neural network (HNN) for hyperspectral image classification. The HNN uses a multi-branch architecture to extract hyperspectral image features in order to improve its prediction accuracy. Moreover, we build a deconvolution structure to recover the information lost in the pooling operations. In addition, to improve convergence and prevent overfitting, the HNN applies batch normalization (BN) and parametric rectified linear units (PReLU). In the experiments, two public benchmark HSIs are utilized to evaluate the performance of the proposed method. The experimental results demonstrate the superiority of HNN over several well-known methods.
- Research Article
54
- 10.1109/tgrs.2021.3131152
- Jan 1, 2022
- IEEE Transactions on Geoscience and Remote Sensing
Unsupervised and semisupervised feature learning has recently emerged as an effective way to reduce the reliance on expensive data collection and annotation for hyperspectral image (HSI) analysis. Existing unsupervised and semisupervised convolutional neural network (CNN)-based HSI classification works still face two challenges: underutilization of pixel-wise multiscale contextual information for feature learning and expensive computational cost, for example, large numbers of floating-point operations (FLOPs), due to the lack of lightweight design. To utilize the unlabeled pixels in HSIs more efficiently, we propose a self-supervised contrastive efficient asymmetric dilated network (SC-EADNet) for HSI classification. There are two novelties in the SC-EADNet. First, a self-supervised multiscale pixel-wise contextual feature learning model is proposed, which generates multiple patches around each hyperspectral pixel and develops a contrastive learning framework to learn from these patches for HSI classification. Second, a lightweight feature extraction network, EADNet, composed of multiple plug-and-play efficient asymmetric dilated convolution (EADC) blocks, is designed and inserted into the contrastive learning framework. The EADC block adopts different dilation rates to capture the spatial information of objects with varying shapes and sizes. Compared with other unsupervised, semisupervised, and supervised learning methods, our SC-EADNet provides competitive classification performance on four hyperspectral datasets, including Indian Pines, Pavia University, Salinas, and Houston 2013, with few FLOPs and fast computation.
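One reason varying dilation rates help capture objects of different shapes and sizes is that each stride-1 dilated layer with kernel size k and dilation d enlarges the receptive field by (k - 1) * d without adding parameters. A small sketch (the specific kernel sizes and rates are illustrative, not the EADC block's actual configuration):

```python
def receptive_field(kernel_sizes, dilations):
    """Receptive field of a stack of stride-1 dilated convolutions.

    Each layer with kernel size k and dilation d contributes (k - 1) * d
    extra pixels of context on top of the single starting pixel.
    """
    rf = 1
    for k, d in zip(kernel_sizes, dilations):
        rf += (k - 1) * d
    return rf

# Three 3x3 layers with dilation rates 1, 2, 3 (hypothetical rates):
print(receptive_field([3, 3, 3], [1, 2, 3]))  # 13
# The same three layers without dilation only reach:
print(receptive_field([3, 3, 3], [1, 1, 1]))  # 7
```

This is why mixing dilation rates yields multiscale spatial context at the same parameter and FLOP budget as plain convolutions.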
- Research Article
57
- 10.1016/j.neucom.2022.02.003
- Feb 8, 2022
- Neurocomputing
EvoDCNN: An evolutionary deep convolutional neural network for image classification
- Research Article
50
- 10.1016/j.engappai.2023.107280
- Oct 12, 2023
- Engineering Applications of Artificial Intelligence
Fuzzy graph convolutional network for hyperspectral image classification
- Research Article
6
- 10.1016/j.jiixd.2024.06.001
- Jun 20, 2024
- Journal of Information and Intelligence
Although the deep-learning method has achieved great success for hyperspectral image (HSI) classification, few-shot HSI classification deserves sufficient study because it is difficult and expensive to acquire labeled samples. In fact, meta-learning methods can effectively improve the performance of few-shot HSI classification. However, most of the existing meta-learning methods for HSI classification are supervised and still heavily rely on labeled data for meta-training. Moreover, there are many cross-scene classification tasks in the real world, and domain adaptation for unsupervised meta-learning has been ignored in HSI classification so far. To address the above issues, this paper proposes an unsupervised meta-learning method with domain adaptation based on a multi-task reconstruction-classification network (MRCN) for few-shot HSI classification. MRCN does not need any labeled data for meta-training; the pseudo labels are generated by multiple spectral random sampling and data augmentation. The meta-training of MRCN jointly learns a shared encoding representation for two tasks and domains. On the one hand, we design an encoder-classifier to learn the classification task on the source-domain data. On the other hand, we devise an encoder-decoder to learn the reconstruction task on the target-domain data. The experimental results on four HSI datasets demonstrate that MRCN performs better than several state-of-the-art methods with only two to five labeled samples per class. To the best of our knowledge, the proposed method is the first unsupervised meta-learning method that considers domain adaptation for few-shot HSI classification.
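Pseudo-label generation by spectral random sampling might be sketched as follows; this is a hypothetical illustration (the function name and parameters are assumptions, not the paper's implementation). Each random band subset of the same pixel is a different view that shares one pseudo label, here the pixel's own index.

```python
import numpy as np

def spectral_random_views(spectrum, n_views, n_bands, seed=0):
    """Create multiple views of one pixel spectrum by random band sampling.

    Every view keeps a different random subset of n_bands bands, sorted
    back into spectral order; all views of the same pixel would share a
    single pseudo label (e.g. the pixel index).
    """
    rng = np.random.default_rng(seed)
    views = []
    for _ in range(n_views):
        keep = np.sort(rng.choice(spectrum.size, size=n_bands, replace=False))
        views.append(spectrum[keep])
    return np.stack(views)

spectrum = np.linspace(0.0, 1.0, 100)   # one pixel's 100-band spectrum
views = spectral_random_views(spectrum, n_views=4, n_bands=30)
print(views.shape)  # (4, 30)
```

Because every view is derived from the same underlying spectrum, a model trained to map them to a common pseudo label needs no human annotation for meta-training.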
- Conference Article
2
- 10.1109/icipmc55686.2022.00027
- May 1, 2022
Recently, deep learning methods using the attention mechanism have generated considerable research interest for hyperspectral image classification. In many existing attention-based methods, global pooling is widely used to obtain the attention weights. In general, a hyperspectral image contains multiple categories, so global pooling is too crude an operation to be appropriate. To alleviate this problem, we propose a coarse-refined local attention network (CRLAN) for hyperspectral image classification. CRLAN is composed of two stages of fully convolutional networks. The first stage employs a coarse local attention fully convolutional network for hyperspectral image classification; in this stage, local parameters are roughly estimated according to the original size of the hyperspectral image. In the second stage, the predicted classification probability from the first-stage network is applied to obtain the refined local attention features. Finally, for testing convenience, these two stages are integrated into an end-to-end network. Experimental results on two public data sets demonstrate that CRLAN is effective in improving classification performance.
- Research Article
42
- 10.1016/j.jag.2024.104092
- Aug 13, 2024
- International Journal of Applied Earth Observation and Geoinformation
A local enhanced mamba network for hyperspectral image classification
- Research Article
- 10.1080/10095020.2026.2627100
- Mar 1, 2026
- Geo-spatial Information Science
In recent years, Convolutional Neural Networks (CNNs) and Vision Transformers (ViTs) have achieved significant progress in Hyperspectral Image (HSI) classification. However, in practical applications, the high cost of sample annotation and the limited availability of training samples lead to overfitting in CNNs and ViTs under few-shot learning scenarios. Siamese networks, as an effective metric learning method, show promising performance in few-shot learning due to their low dependency on sample information. However, traditional siamese networks rely on static parameter-sharing mechanisms, lack feature interaction between the two subnetworks, and struggle to effectively capture the spatial-spectral heterogeneity in hyperspectral data. Additionally, they are prone to noise interference, resulting in insufficient discriminative power of key features. To address these challenges, this paper proposes a Contextual Interaction Siamese Network for Few-Shot Hyperspectral Image Classification (CISNet). First, an Interactive Feature Fusion Module (IFFM) is introduced to capture the similarities and differences between features from the two subnetworks, thereby enhancing the discriminative power of key features. Second, an Enhanced Token Generation Module (ETGM) is designed to generate correlated class tokens for the two subnetworks. Finally, this paper innovatively proposes a Context Interaction Transformer Block (CITB) and a Guided Attention (GA) mechanism to strengthen global context interaction between the two subnetworks. Extensive experiments demonstrate that CISNet achieves superior performance under few-shot conditions and outperforms other state-of-the-art methods in classification accuracy.
- Research Article
35
- 10.1109/tgrs.2022.3180685
- Jan 1, 2022
- IEEE Transactions on Geoscience and Remote Sensing
Hyperspectral image (HSI) classification has been a hot topic for decades, as hyperspectral images have rich spatial and spectral information and provide a strong basis for distinguishing different land-cover objects. Benefiting from the development of deep learning technologies, deep learning-based HSI classification methods have achieved promising performance. Recently, several neural architecture search (NAS) algorithms have been proposed for HSI classification, which further improve the accuracy of HSI classification to a new level. In this paper, NAS and Transformer are combined for the HSI classification task for the first time. Compared with previous work, the proposed method has two main differences. First, we revisit the search spaces designed in previous HSI classification NAS methods and propose a novel hybrid search space, consisting of a space-dominated cell and a spectrum-dominated cell. Compared with search spaces proposed in previous works, the proposed hybrid search space is better aligned with the characteristics of HSI data, namely a relatively low spatial resolution and an extremely high spectral resolution. Second, to further improve the classification accuracy, we graft the emerging transformer module onto the automatically designed convolutional neural network (CNN) to add global information to the local-region-focused features learned by the CNN. Experimental results on three public HSI datasets show that the proposed method achieves much better performance than comparison approaches, including manually designed networks and NAS-based HSI classification methods. Especially on the most recently captured dataset, Houston University, overall accuracy is improved by nearly 6 percentage points. Code is available at: https://github.com/Cecilia-xue/HyT-NAS.
- Research Article
20
- 10.1080/01431161.2023.2249598
- Sep 8, 2023
- International Journal of Remote Sensing
Convolutional Neural Networks (CNNs) have advanced hyperspectral image (HSI) classification effectively. Although many CNN-based models can extract local features in HSI, it is difficult for them to extract global features. With its ability to capture long-range dependencies, the Transformer is gradually gaining prominence in HSI classification, but it may overlook some local details when extracting features. To address these issues, we propose a CNN and transformer interaction network (CTIN) for HSI classification. Firstly, a dual-branch structure is constructed in which a CNN and a Transformer are arranged in parallel to simultaneously extract global and local features in HSI. Secondly, a feature interaction module is introduced between the two branches, facilitating a bi-directional flow of information between the global and local feature spaces. In this way, the network structure combines the advantages of CNN and Transformer feature extraction as much as possible. In addition, a token generation method is designed to harness abundant contextual information relevant to the centre pixel and improve the accuracy of the final classification. Experiments were conducted on four hyperspectral datasets (two classical datasets – Indian Pines and Salinas Valley, a new satellite dataset – Yellow River, and a self-made UAV dataset – Yellow River Willow). Experimental results show that the proposed method outperforms other state-of-the-art methods, with overall accuracies of 99.21%, 99.61%, 92.40%, and 98.17%, respectively.
- Research Article
38
- 10.3390/rs12122035
- Jun 24, 2020
- Remote Sensing
Recently, deep learning methods based on three-dimensional (3-D) convolution have been widely used in hyperspectral image (HSI) classification tasks and have shown good classification performance. However, affected by the irregular distribution of classes in HSI datasets, most previous 3-D convolutional neural network (CNN)-based models require more training samples to obtain better classification accuracies. In addition, as the network deepens, the spatial resolution of feature maps gradually decreases, so much useful information may be lost during the training process. Therefore, ensuring efficient network training is key to HSI classification tasks. To address this issue, we propose a 3-D-CNN-based residual group channel and space attention network (RGCSA) for HSI classification. Firstly, the proposed bottom-up top-down attention structure with residual connections improves network training efficiency by optimizing channel-wise and spatial-wise features throughout the whole training process. Secondly, the proposed residual group channel-wise attention module reduces the possibility of losing useful information, and the novel spatial-wise attention module extracts context information to strengthen the spatial features. Furthermore, our proposed RGCSA network needs only a few training samples to achieve higher classification accuracies than previous 3-D-CNN-based networks. The experimental results on three commonly used HSI datasets demonstrate the superiority of the proposed attention-based network and the effectiveness of the proposed channel-wise and spatial-wise attention modules for HSI classification. The code and configurations are released on GitHub.
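Channel-wise attention of this kind is in the spirit of squeeze-and-excitation gating; a minimal NumPy sketch (the weights `w1`/`w2` and the reduction ratio are illustrative assumptions, not the paper's RGCSA module) is:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def channel_attention(features, w1, w2):
    """Squeeze-and-excitation style channel attention.

    features: (C, H, W) feature maps.
    w1: (C // r, C) and w2: (C, C // r) weights of the two-layer gating
    MLP, with reduction ratio r.
    """
    squeeze = features.mean(axis=(1, 2))                  # global average pool -> (C,)
    excite = sigmoid(w2 @ np.maximum(w1 @ squeeze, 0.0))  # per-channel gates in (0, 1)
    return features * excite[:, None, None]               # rescale each channel

rng = np.random.default_rng(1)
feat = rng.normal(size=(8, 4, 4))   # (C, H, W) with C = 8
w1 = rng.normal(size=(2, 8))        # reduction ratio r = 4
w2 = rng.normal(size=(8, 2))
out = channel_attention(feat, w1, w2)
print(out.shape)  # (8, 4, 4)
```

Because each gate lies in (0, 1), the module can only down-weight uninformative channels while leaving the map shapes untouched, which is why such attention slots into a residual structure without disturbing it.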