Heterogeneous Few-shot Learning with Knowledge Distillation for Hyperspectral Image Classification
Hyperspectral image (HSI) classification is one of the most popular applications in remote sensing. In practice, because manual labeling is costly, only a few labeled hyperspectral image samples can be obtained. Deep networks tend to overfit such a small number of labeled training samples, which causes a sharp decline in classification accuracy. To solve this problem, this paper proposes a classification method for hyperspectral images based on knowledge distillation and heterogeneous few-shot learning. First, the model pretrains the feature extraction network on miniImageNet, a natural image dataset for few-shot learning with abundant labeled images, and introduces knowledge distillation to improve the feature expression capability of the shallow network in the small-sample setting. Then, effective knowledge transfer is carried out between the two heterogeneous datasets: the weights learned on the natural image dataset are transferred to the backbone network for hyperspectral image classification to improve the accuracy of HSI classification. Finally, the classifier is fine-tuned on HSI using the small-sample learning paradigm to extract discriminative hyperspectral image features and further enhance the model's expression of detail. Experimental results on two hyperspectral image classification datasets show that the proposed method effectively improves the accuracy of small-sample hyperspectral image classification.
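The abstract does not give the distillation objective. As a rough illustration only, the sketch below assumes the common temperature-softened formulation, in which a KL term between teacher and student is blended with ordinary cross-entropy. The function names, the temperature of 4.0, and the weight `alpha` are illustrative assumptions, not values taken from the paper.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, label,
                      temperature=4.0, alpha=0.5):
    """Assumed loss: alpha * T^2 * KL(teacher || student) + (1 - alpha) * CE."""
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    # KL divergence between the softened teacher and student distributions.
    kl = sum(t * math.log(t / s)
             for t, s in zip(p_teacher, p_student) if t > 0)
    # Cross-entropy of the student against the hard label (temperature 1).
    ce = -math.log(softmax(student_logits)[label])
    return alpha * temperature ** 2 * kl + (1 - alpha) * ce
```

When the student reproduces the teacher's logits exactly, the KL term vanishes and only the weighted cross-entropy remains, which is a convenient sanity check for an implementation of this kind.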
- Research Article
746
- 10.1109/lgrs.2010.2047711
- Oct 1, 2010
- IEEE Geoscience and Remote Sensing Letters
The high number of spectral bands acquired by hyperspectral sensors increases the capability to distinguish physical materials and objects, presenting new challenges to image analysis and classification. This letter presents a novel method for accurate spectral-spatial classification of hyperspectral images. The proposed technique consists of two steps. In the first step, a probabilistic support vector machine pixelwise classification of the hyperspectral image is applied. In the second step, spatial contextual information is used for refining the classification results obtained in the first step. This is achieved by means of a Markov random field regularization. Experimental results are presented for three hyperspectral airborne images and compared with those obtained by recently proposed advanced spectral-spatial classification techniques. The proposed method improves classification accuracies when compared to other classification approaches.
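The two-step idea (pixelwise class probabilities, then spatial regularization) can be sketched as follows. This is a minimal illustration that assumes a Potts-style pairwise penalty on 4-connected neighbours and iterated conditional modes (ICM) as the optimizer; the abstract specifies neither. The per-pixel probabilities are taken as given, standing in for the output of a probabilistic SVM. The function name and `beta` are hypothetical.

```python
import math


def icm_refine(probs, beta=1.5, n_iter=5):
    """Refine a pixelwise probability map with a Potts-model MRF via ICM.

    probs[r][c] is a list of per-class probabilities for pixel (r, c).
    Returns a 2-D list of class labels.
    """
    rows, cols = len(probs), len(probs[0])
    n_classes = len(probs[0][0])
    # Step 1 result: start from the pixelwise maximum-probability labels.
    labels = [[max(range(n_classes), key=lambda k: probs[r][c][k])
               for c in range(cols)] for r in range(rows)]
    for _ in range(n_iter):
        changed = False
        for r in range(rows):
            for c in range(cols):
                neighbours = [labels[rr][cc]
                              for rr, cc in ((r - 1, c), (r + 1, c),
                                             (r, c - 1), (r, c + 1))
                              if 0 <= rr < rows and 0 <= cc < cols]

                def energy(k):
                    # Unary term from the classifier, pairwise Potts penalty.
                    unary = -math.log(max(probs[r][c][k], 1e-12))
                    pairwise = beta * sum(1 for n in neighbours if n != k)
                    return unary + pairwise

                best = min(range(n_classes), key=energy)
                if best != labels[r][c]:
                    labels[r][c] = best
                    changed = True
        if not changed:
            break
    return labels
```

With `beta=0` the result reduces to the pixelwise decision; larger values of `beta` smooth isolated labels toward their neighbourhood.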
- Research Article
183
- 10.1109/tgrs.2019.2951445
- Dec 5, 2019
- IEEE Transactions on Geoscience and Remote Sensing
Deep convolutional neural networks (CNNs) have shown outstanding performance in hyperspectral image (HSI) classification. The success of CNN-based HSI classification relies on the availability of sufficient training samples. However, the collection of training samples is expensive and time consuming. Meanwhile, many models pretrained on large-scale data sets are available, and they extract general and discriminative features. Proper reuse of their low-level and midlevel representations can significantly improve HSI classification accuracy. Images in the large-scale ImageNet data set have three channels, whereas an HSI contains hundreds of channels, so simply adapting the pretrained models to HSI classification is difficult. In this article, heterogeneous transfer learning for HSI classification is proposed. First, a mapping layer is used to handle the difference in the number of channels. Then, the model architectures and weights of the CNN trained on the ImageNet data set are used to initialize the model and weights of the HSI classification network. Finally, a well-designed neural network is used to perform the HSI classification task. Furthermore, an attention mechanism is used to adjust the feature maps to account for the difference between the heterogeneous data sets. Moreover, controlled random sampling is used as another training sample selection method to test the effectiveness of the proposed methods. Experimental results on four popular hyperspectral data sets with two training sample selection strategies show that the transferred CNN obtains better classification accuracy than state-of-the-art methods. In addition, the idea of heterogeneous transfer learning may open a new window for further research.
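The abstract only says that "a mapping layer" reconciles the channel mismatch. One plausible reading, shown here purely as an assumption, is a per-pixel linear projection (equivalent to a 1x1 convolution) from B spectral bands to the three channels an ImageNet-pretrained network expects. The function name and the weight layout are hypothetical.

```python
def map_bands_to_three_channels(cube, weights, bias):
    """Project every pixel of an HSI cube from B bands to 3 channels.

    cube[r][c] is a list of B band values, weights is a 3 x B matrix and
    bias has length 3. This is the arithmetic of a 1x1 convolution.
    """
    return [[[sum(w * x for w, x in zip(channel_weights, pixel)) + b
              for channel_weights, b in zip(weights, bias)]
             for pixel in row]
            for row in cube]
```

In a trainable network the weights and bias would be learned jointly with the transferred backbone rather than fixed by hand.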
- Conference Article
6
- 10.1117/12.2573715
- Oct 14, 2020
Classification of hyperspectral images is an important step of hyperspectral image interpretation. Different studies demonstrate that spatial features can provide complementary information for increasing the accuracy of hyperspectral image classification. In this study, we propose a method of spectral-spatial classification of hyperspectral images that uses specific multifractal features as the spatial features. The proposed method consists of the following steps. First, informative multifractal features are extracted from the first few principal components of the spectral features. To construct the multifractal features, various 1D and 2D multifractal characteristics, including our previously introduced 2D multifractal characteristics of global scaling exponents, are calculated in windows centered on each element of the principal component images using a generalized local-global multifractal image analysis. After that, the obtained multifractal features are stacked with the spectral features into high-dimensional feature vectors. Finally, the resulting high-dimensional vectors of spectral and multifractal features are classified by a support vector machine classifier. The multifractal characteristics used to construct the multifractal features have several advantages: they provide good textural separability of image objects, are invariant to image scaling and rotation, and are insensitive to image noise. Experiments performed on several widely known test hyperspectral images have demonstrated that the proposed method exhibits better performance than competitive methods of spectral-spatial classification of hyperspectral images in terms of overall accuracy and the kappa statistic.
In addition, it is shown that the introduced classification method can outperform some deep learning methods of hyperspectral image classification, which have attracted great interest in recent years. In particular, it was established that the proposed method can achieve better classification results than deep learning methods when small training samples are used for classification. In the future, we will focus on developing methods for object-oriented classification of hyperspectral images that are based on multifractal features. The study has been supported by the Ministry of Education and Science of the Russian Federation (Project No. МК-3477.2019.5) and by the Russian Foundation for Basic Research (Project No. 19-05-00330 А).
- Research Article
69
- 10.1109/lgrs.2021.3117577
- Jan 1, 2022
- IEEE Geoscience and Remote Sensing Letters
Deep learning has achieved great success in hyperspectral image (HSI) classification. However, its success relies on the availability of sufficient training samples. Unfortunately, the collection of training samples is expensive, time-consuming, and even impossible in some cases. Natural image datasets that differ from HSI, such as ImageNet and mini-ImageNet, have abundant texture and structure information. Effective knowledge transfer between two heterogeneous datasets can significantly improve the accuracy of HSI classification. In this letter, heterogeneous few-shot learning (HFSL) for HSI classification is proposed with only a few labeled samples per class. First, few-shot learning is performed on the mini-ImageNet dataset to learn the transferable knowledge. Then, to make full use of the spatial and spectral information, a spectral–spatial fusion network is devised. Spectral information is obtained by a residual network with pure 1-D operators. Spatial information is extracted by a convolution network with pure 2-D operators, and the weights of the spatial network are initialized with the weights of the model trained on the mini-ImageNet dataset. Finally, few-shot learning is fine-tuned on HSI to extract discriminative spectral–spatial features and individual knowledge, which can improve the classification performance of the new classification task. Experiments conducted on two public HSI datasets demonstrate that HFSL outperforms the existing few-shot learning methods and supervised learning methods for HSI classification with only a few labeled samples. Our source code is available at https://github.com/Li-ZK/HFSL.
- Research Article
156
- 10.1109/jstars.2021.3059451
- Jan 1, 2021
- IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing
To improve the accuracy and generalization ability of hyperspectral image classification, a feature extraction method integrating principal component analysis (PCA) and local binary pattern (LBP) is developed for hyperspectral images in this article. PCA is employed to reduce the dimension of the spectral features of hyperspectral images. LBP, which has low computational complexity, is used to extract the local spatial texture features of hyperspectral images to construct multifeature vectors. Then, the gray wolf optimization algorithm, which has global search capability, is employed to optimize the parameters of a kernel extreme learning machine (KELM), and the optimized KELM model is used to realize a hyperspectral image classification method (PLG-KELM). Finally, the Indian Pines, Houston, and Pavia University datasets, together with an application on the WHU-Hi-LongKou dataset, are selected to verify the effectiveness of PLG-KELM. The comparison experiments show that PLG-KELM obtains higher classification accuracy and shows better generalization ability for small samples. It provides a new idea for processing hyperspectral images.
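To make the LBP step concrete, here is a minimal sketch of the basic 3x3 local binary pattern applied to a single 2-D image, such as one principal-component band. The abstract does not state which LBP variant (radius, number of sampling points, uniform patterns) the authors use, so the basic 8-neighbour form, the bit order, and the function name are all assumptions.

```python
def lbp_image(img):
    """Basic 3x3 local binary pattern codes for the interior pixels.

    img is a 2-D list of grey values. Each interior pixel gets an 8-bit
    code with one bit per neighbour that is >= the centre value.
    """
    # Clockwise neighbour offsets starting at the top-left corner.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    rows, cols = len(img), len(img[0])
    codes = []
    for r in range(1, rows - 1):
        row_codes = []
        for c in range(1, cols - 1):
            centre = img[r][c]
            code = 0
            for bit, (dr, dc) in enumerate(offsets):
                if img[r + dr][c + dc] >= centre:
                    code |= 1 << bit
            row_codes.append(code)
        codes.append(row_codes)
    return codes
```

A histogram of these codes over a local window is the usual texture descriptor, which could then be concatenated with the PCA-reduced spectral vector.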
- Book Chapter
5
- 10.1007/978-3-030-31723-2_66
- Jan 1, 2019
In recent studies, superpixel segmentation has been integrated into hyperspectral (HS) image classification methods. However, the existing superpixel-based classification methods usually suffer from two serious problems. First, the accuracy and efficiency of current superpixel segmentation approaches cannot meet the demands of practical applications for HS images; second, conventional superpixel-based classification methods generally treat each generated superpixel as a unit for image classification, which may reduce the computing time but results in a significant decrease in classification accuracy. To solve these problems, we propose a fast region growing based superpixel segmentation (FRGSS) algorithm and a novel texture-adaptive superpixel integration strategy (TASIS) for HS image classification. Experimental results on real Airborne Visible/Infrared Imaging Spectrometer (AVIRIS) HS images demonstrate that the proposed FRGSS outperforms the state-of-the-art superpixel algorithm. In addition, the superiority of TASIS over the pixel-wise and the conventional superpixel-based classification methods is verified.
- Research Article
8
- 10.1080/01431161.2022.2048916
- Mar 4, 2022
- International Journal of Remote Sensing
Existing graph-based, semi-supervised hyperspectral image (HSI) classification models often suffer from prolonged execution time due to high computational complexity. In this work, we first propose a fast anchor graph regularization (FAGR) model for large-scale HSI classification. FAGR employs a simple anchor-based graph construction procedure and a new adjacency matrix among anchors to dramatically reduce the computational complexity while attaining good classification performance. To further improve the classification accuracy of hyperspectral images, we propose a novel semi-supervised anchor graph ensemble (SAGE) model. SAGE is an ensemble realization of multiple FAGR models, with each component FAGR operating on a randomly selected subset of features. A meta classifier is applied to aggregate the outputs of the component classifiers to yield an ensemble classification result. We performed extensive experiments on three real-world HSI datasets to compare the performance of FAGR and SAGE against several existing graph-based HSI classifiers. The experimental results show that the proposed SAGE achieves 95.78% classification accuracy on the Indian Pines dataset using limited labeled samples, outperforming existing models in terms of both shorter execution time and better classification accuracy.
- Research Article
24
- 10.3390/rs12050779
- Feb 29, 2020
- Remote Sensing
Recently, hyperspectral image (HSI) classification methods based on deep learning models have shown encouraging performance. However, the limited number of training samples, as well as the mixed pixels caused by low spatial resolution, have become major obstacles for HSI classification. To tackle these problems, we propose a resource-efficient HSI classification framework that introduces adaptive spectral unmixing into a 3D/2D dense network with an early-exiting strategy. More specifically, on the one hand, our framework uses a cascade of intermediate classifiers throughout the 3D/2D dense network, which is trained end-to-end. The proposed 3D/2D dense network, which integrates 3D convolutions with 2D convolutions, is more capable of handling spectral-spatial features while containing fewer parameters than conventional 3D convolutions, and it further boosts the network performance with limited training samples. On the other hand, considering the existence of mixed pixels in HSI data, the pixels in HSI classification are divided into hard samples and easy samples. With the early-exiting strategy in these intermediate classifiers, the average accuracy can be improved by reducing the computation cost for easy samples and thus focusing on classifying hard samples. Furthermore, for hard samples, an adaptive spectral unmixing method is proposed as a complementary source of information for classification, which brings considerable benefits to the final performance. Experimental results on four HSI benchmark datasets demonstrate that the proposed method achieves better performance than state-of-the-art deep learning-based methods and other traditional HSI classification methods.
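The early-exiting idea can be illustrated with a short sketch. It assumes that a sample leaves the cascade at the first intermediate classifier whose maximum class probability reaches a fixed threshold; the abstract does not state the actual exit criterion, so the threshold rule, its value of 0.9, and the function name are hypothetical.

```python
def classify_with_early_exit(sample, classifiers, threshold=0.9):
    """Run a cascade of classifiers and stop at the first confident one.

    classifiers is an ordered list of callables, each mapping a sample to
    a list of class probabilities. Returns (predicted_class, exit_depth).
    """
    for depth, classifier in enumerate(classifiers):
        probs = classifier(sample)
        confidence = max(probs)
        is_last = depth == len(classifiers) - 1
        # Easy samples exit early; hard samples fall through to the end.
        if confidence >= threshold or is_last:
            return probs.index(confidence), depth
    raise ValueError("classifiers must not be empty")
```

Because later classifiers are only called when earlier ones are not confident, easy samples cost less computation than hard ones.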
- Research Article
14
- 10.3390/electronics11162540
- Aug 13, 2022
- Electronics
Deep learning has achieved significant success in the field of hyperspectral image (HSI) classification, but challenges are still faced when the number of training samples is small. Feature fusing approaches based on multi-channel and multi-scale feature extractions are attractive for HSI classification where few samples are available. In this paper, based on feature fusion, we proposed a simple yet effective CNN-based Dual-channel Spectral Enhancement Network (DSEN) to fully exploit the features of the small labeled HSI samples for HSI classification. We worked with the observation that, in many HSI classification models, most of the incorrectly classified pixels of HSI are at the border of different classes, which is caused by feature obfuscation. Hence, in DSEN, we specially designed a spectral feature extraction channel to enhance the spectral feature representation of the specific pixel. Moreover, a spatial–spectral channel was designed using small convolution kernels to extract the spatial–spectral features of HSI. By adjusting the fusion proportion of the features extracted from the two channels, the expression of spectral features was enhanced in terms of the fused features for better HSI classification. The experimental results demonstrated that the overall accuracy (OA) of HSI classification using the proposed DSEN reached 69.47%, 80.54%, and 93.24% when only five training samples for each class were selected from the Indian Pines (IP), University of Pavia (UP), and Salinas Scene (SA) datasets, respectively. The performance improved when the number of training samples increased. Compared with several related methods, DSEN demonstrated superior performance in HSI classification.
- Research Article
10
- 10.1109/jstars.2021.3123371
- Jan 1, 2021
- IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing
Over the past few years, convolutional neural networks (CNNs) have been broadly adopted in remote sensing (RS) imagery processing due to their impressive capabilities in feature extraction. Nevertheless, it is still a challenge for CNN-based hyperspectral image (HSI) classification methods to extract more effective spectral-spatial features that take all spectral bands into account. Driven by this issue, we propose a novel approach to the HSI classification task, referred to as the multi-level joint feature extraction network (MJFEN). The proposed network makes full use of the information in each channel of the HSI and transforms it into valid channel-wise spatial features through a designed convolution process. Moreover, these feature maps form global attention details to guide the extraction of spectral-spatial features, which are passed to the next level for further feature mining. Then, the features obtained at different levels are integrated for ground object classification. In comparison with several state-of-the-art HSI classification methods on four public datasets, experimental results demonstrate the effectiveness and remarkable feature extraction capability of our proposed approach.
- Conference Article
1
- 10.1145/3641584.3641609
- Sep 22, 2023
With the continuous innovation in deep learning, introducing deep learning into hyperspectral image classification to improve its accuracy has become a major research direction. Convolutional neural networks (CNNs) are among the most commonly used deep learning methods for visual data processing and are widely used in hyperspectral image (HSI) classification by virtue of their excellent contextual modeling capability. The performance of HSI classification depends heavily on spatial and spectral information, yet current CNN-based HSI classification models extract spatial-spectral features insufficiently and fail to mine and represent the sequence properties of spectral features well. To address these problems, this paper proposes an HSI classification method that uses a 3D attention mechanism in collaboration with a Transformer. We introduce a variant Transformer model built on a hybrid of an improved 3D-CNN and a 2D-CNN. The model combines complementary spatial-spectral and spectral information through 3D and 2D convolutions, adds a variant attention module to strengthen spatial texture features, and uses a grouped transfer Transformer with skip connections so that lower layers can better learn upper-layer features. First, a variant channel attention mechanism is introduced into the 3D-CNN to enhance its acquisition of the spectral information of image features. Second, a variant spatial attention mechanism is introduced to enable the 3D-CNN to better acquire the spatial information of hyperspectral images, and the acquired spatial and spectral feature information is then passed to the 2D-CNN so that it can better acquire local feature information.
Finally, the acquired image feature information is passed to the variant Transformer model to compensate for the fact that a CNN can only acquire hyperspectral image features in local contexts, enabling the model to better acquire global feature information from feature sequences. The proposed model is evaluated on two hyperspectral datasets, Indian Pines and Pavia University (PU). On the PU dataset, the overall classification accuracy (OA), average classification accuracy (AA), and Kappa coefficient reach up to 99.59%, 99.31%, and 99.45%, respectively, an improvement in classification accuracy over current cutting-edge techniques.
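For readers unfamiliar with the three reported metrics, the following sketch computes OA, AA, and Cohen's kappa from a confusion matrix using their standard definitions. The function name is illustrative, and the sketch assumes every class has at least one true sample and that agreement by chance is below 1.

```python
def accuracy_metrics(confusion):
    """Compute (OA, AA, kappa) from a square confusion matrix.

    confusion[i][j] is the number of samples of true class i that were
    predicted as class j.
    """
    k = len(confusion)
    n = sum(sum(row) for row in confusion)
    row_sums = [sum(row) for row in confusion]
    col_sums = [sum(confusion[i][j] for i in range(k)) for j in range(k)]
    # Overall accuracy: fraction of all samples on the diagonal.
    oa = sum(confusion[i][i] for i in range(k)) / n
    # Average accuracy: mean of the per-class recalls.
    aa = sum(confusion[i][i] / row_sums[i] for i in range(k)) / k
    # Kappa: agreement corrected for the agreement expected by chance.
    expected = sum(row_sums[i] * col_sums[i] for i in range(k)) / (n * n)
    kappa = (oa - expected) / (1 - expected)
    return oa, aa, kappa
```

For example, the matrix `[[40, 10], [5, 45]]` gives OA = 0.85, AA = 0.85, and kappa = 0.70.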
- Research Article
40
- 10.3390/s21051751
- Mar 3, 2021
- Sensors
Hyperspectral image (HSI) classification is the subject of intense research in remote sensing. The tremendous success of deep learning in computer vision has recently sparked the interest in applying deep learning in hyperspectral image classification. However, most deep learning methods for hyperspectral image classification are based on convolutional neural networks (CNN). Those methods require heavy GPU memory resources and run time. Recently, another deep learning model, the transformer, has been applied for image recognition, and the study result demonstrates the great potential of the transformer network for computer vision tasks. In this paper, we propose a model for hyperspectral image classification based on the transformer, which is widely used in natural language processing. Besides, we believe we are the first to combine the metric learning and the transformer model in hyperspectral image classification. Moreover, to improve the model classification performance when the available training samples are limited, we use the 1-D convolution and Mish activation function. The experimental results on three widely used hyperspectral image data sets demonstrate the proposed model’s advantages in accuracy, GPU memory cost, and running time.
- Research Article
26
- 10.1016/j.sigpro.2023.109202
- Jul 28, 2023
- Signal Processing
Hyperspectral remote sensing image classification based on residual generative adversarial neural networks
- Research Article
36
- 10.3390/rs11050534
- Mar 5, 2019
- Remote Sensing
Spectral features cannot effectively reflect the differences among the ground objects and distinguish their boundaries in hyperspectral image (HSI) classification. Multi-scale feature extraction can solve this problem and improve the accuracy of HSI classification. The Gaussian pyramid can effectively decompose HSI into multi-scale structures, and efficiently extract features of different scales by stepwise filtering and downsampling. Therefore, this paper proposed a Gaussian pyramid based multi-scale feature extraction (MSFE) classification method for HSI. First, the HSI is decomposed into several Gaussian pyramids to extract multi-scale features. Second, we construct probability maps in each layer of the Gaussian pyramid and employ edge-preserving filtering (EPF) algorithms to further optimize the details. Finally, the final classification map is acquired by a majority voting method. Compared with other spectral-spatial classification methods, the proposed method can not only extract the characteristics of different scales, but also can better preserve detailed structures and the edge regions of the image. Experiments performed on three real hyperspectral datasets show that the proposed method can achieve competitive classification accuracy.
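The final fusion step, majority voting across the classification maps obtained at different pyramid levels, can be sketched as below. The sketch assumes the per-level label maps have already been brought to a common size, and it breaks ties in favour of the label that appears first in the list of maps; neither detail is given in the abstract, and the function name is hypothetical.

```python
from collections import Counter


def majority_vote(label_maps):
    """Fuse several equally sized 2-D label maps by per-pixel majority vote.

    Ties go to the label encountered first, i.e. the earliest map.
    """
    rows, cols = len(label_maps[0]), len(label_maps[0][0])
    return [[Counter(m[r][c] for m in label_maps).most_common(1)[0][0]
             for c in range(cols)]
            for r in range(rows)]
```

Using an odd number of maps reduces the number of ties in the two-class case, although ties remain possible with more classes.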
- Research Article
- 10.1109/tgrs.2025.3618636
- Jan 1, 2025
- IEEE Transactions on Geoscience and Remote Sensing
Most hyperspectral image (HSI) classification methods assume that all classes in the test set are present during training. However, in real-world applications, acquiring labeled training samples is challenging. As a result, it is difficult for the training dataset to cover all possible land cover types, leading to the generalized zero-shot learning (GZSL) problem. Recently, vision-language models (VLMs) have provided rich semantic priors for land cover classes, offering promising potential for GZSL. However, two fundamental gaps hinder their application to HSI classification: the task paradigm gap, arising from the difference between image-level VLMs and the pixel-level HSI classification task; and the knowledge gap, due to the inconsistency between VLM features and HSI spectral–spatial representations. To bridge both gaps, a novel framework leveraging VLM semantic priors for GZSL in HSI classification is proposed, primarily using pseudo-labeling technique to provide knowledge for unseen classes. Specifically, a pseudo-label generation and enhancement module enables a paradigm transition from image-level understanding to pixel-level classification by incorporating HSI’s spatial information. A pseudo-label correction module then refines noisy labels using spectral cues to address the knowledge gap. Finally, a global learning strategy integrates pseudo-label distillation, supervised learning, and feature regularization to classify seen classes while enabling generalization to unseen ones. Experiments on benchmark HSI datasets demonstrate the proposed method’s superiority in generalized zero-shot classification. This work highlights the potential of VLMs in advancing HSI classification in practical applications.