MCDB-Net: Multiview Collaborative Dual-Branch Unmixing Network for Hyperspectral Images
Hyperspectral unmixing based on autoencoders is a crucial research task in remote sensing imagery. While existing deep hyperspectral unmixing networks primarily focus on spatial features, the inherently rich and continuous spectral bands in hyperspectral images harbor significant underutilized information. The intrinsic characteristics of these continuous bands can naturally enable more nuanced and effective modeling of mixed pixels. In this paper, we propose the multiview collaborative dual-branch network (MCDB-Net), designed to fully learn and exploit the complex spectral features in high-dimensional hyperspectral data, thereby enhancing the representation capabilities of these features. MCDB-Net constructs a novel multiview spectral block to strengthen the correlation between multiple views of pixel spectra. The full view spectral feature extraction module highlights important spectral features, while the local multiview spectral feature extraction module provides a detailed understanding of the interactions between multiview spectral information. The multiview abundance collaboration module collaboratively learns spectral feature information from different perspectives and dynamically adjusts the weights of abundance estimations, leading to better integration of abundance features across various views. Extensive experiments on different datasets demonstrate that MCDB-Net achieves higher continuity and robustness in unmixing results, showcasing its powerful capability in spectral feature extraction.
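For orientation, the sketch below illustrates the linear mixing model that autoencoder-based unmixing commonly assumes: a pixel spectrum is treated as a weighted sum of endmember spectra, with abundances that are non-negative and sum to one. It is not the MCDB-Net architecture. The function names, the use of softmax to enforce the constraints, and the toy endmembers are all assumptions made for illustration.

```python
import math

def softmax(logits):
    """Map unconstrained scores to abundances that are non-negative and sum to one."""
    peak = max(logits)                      # subtract the max for numerical stability
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def reconstruct_pixel(abundances, endmembers):
    """Linear mixing model: pixel[b] = sum over k of abundances[k] * endmembers[k][b]."""
    n_bands = len(endmembers[0])
    return [
        sum(a * e[b] for a, e in zip(abundances, endmembers))
        for b in range(n_bands)
    ]

# Hypothetical toy data: 2 endmembers, 3 spectral bands.
endmembers = [[1.0, 0.0, 0.5], [0.0, 1.0, 0.5]]
abundances = softmax([0.0, 0.0])            # equal scores give equal abundances
pixel = reconstruct_pixel(abundances, endmembers)
```

In an autoencoder setting, the encoder would produce the scores passed to `softmax`, and the decoder weights would play the role of `endmembers`.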
- Research Article
17
- 10.3390/rs15071803
- Mar 28, 2023
- Remote Sensing
Hyperspectral images (HSI) offer powerful spectral characterization capabilities and are widely used, especially for classification applications. However, the rich spectrum contained in HSI also increases the difficulty of extracting useful information, which makes the feature extraction method significant, as it enables effective expression and utilization of the spectrum. Traditional HSI feature extraction methods design spectral features manually, which is likely to be limited by the complex spectral information within HSI. Recently, data-driven methods, especially convolutional neural networks (CNNs), have shown great improvements in performance when processing image data owing to their powerful automatic feature learning and extraction abilities, and are also widely used for HSI feature extraction and classification. The CNN extracts features based on the convolution operation. Nevertheless, the local perception of the convolution operation makes the CNN focus on local spectral features (LSF) and weakens the description of features between long-distance spectral ranges, which are referred to as global spectral features (GSF) in this study. LSF and GSF describe the spectral features from two different perspectives and are both essential for determining the spectrum. Thus, in this study, a local-global spectral feature (LGSF) extraction and optimization method is proposed to jointly consider the LSF and GSF for HSI classification. To strengthen the relationship between spectra and make it possible to obtain features in more forms, we first transformed the 1D spectral vector into a 2D spectral image. Based on the spectral image, the local spectral feature extraction module (LSFEM) and the global spectral feature extraction module (GSFEM) are proposed to automatically extract the LGSF. Inspired by contrastive learning, a loss function for spectral feature optimization is proposed to optimize the LGSF and obtain improved class separability. We further enhanced the LGSF by introducing spatial relations and designed a CNN constructed using dilated convolution for classification. The proposed method was evaluated on four widely used HSI datasets, and the results highlighted its comprehensive utilization of spectral information as well as its effectiveness in HSI classification.
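The step of turning a 1D spectral vector into a 2D spectral image can be sketched as follows. The abstract does not state the exact layout, so this example assumes a near-square, row-by-row arrangement with zero padding. The function name `spectrum_to_image` and the `pad_value` parameter are hypothetical.

```python
import math

def spectrum_to_image(spectrum, pad_value=0.0):
    """Reshape a 1D spectral vector into a near-square 2D grid, filled row by row."""
    n_bands = len(spectrum)
    if n_bands == 0:
        raise ValueError("spectrum must contain at least one band")
    side = math.isqrt(n_bands - 1) + 1      # smallest side with side * side >= n_bands
    padded = list(spectrum) + [pad_value] * (side * side - n_bands)
    return [padded[r * side:(r + 1) * side] for r in range(side)]

# Hypothetical 7-band spectrum becomes a 3 x 3 grid with two padded cells.
image = spectrum_to_image([0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7])
```

With this layout, bands that are `side` positions apart in the spectrum become vertical neighbours, so a small 2D kernel can cover both adjacent and more distant bands.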
- Preprint Article
- 10.21203/rs.3.rs-4787893/v1
- Aug 21, 2024
- Research Square
Hyperspectral images contain rich spatial and spectral information, which has drawn a growing number of researchers to their analysis. Convolutional neural networks have been widely used in hyperspectral image classification. However, due to the high dimensionality and band correlation of hyperspectral image data, the data contain a large amount of redundant information, which not only adds to the computational burden but also hinders the extraction of global and local spectral and spatial features during classification. We design a hybrid convolutional model based on spatial and spectral channel reconstruction. The model uses hybrid convolution to extract spatial and spectral features from hyperspectral images, separates and reconstructs the spatial and spectral channels to suppress redundant features and reduce the computational load, and introduces a global attention mechanism to enlarge the global receptive field and learn global spectral and spatial features. We conduct experiments on three widely used public datasets, IndianPines, PaviaU, and Houston 2013, and the overall accuracies reach 98.66%, 99.49%, and 99.07%, respectively, which validates the effectiveness of the model.
- Research Article
8
- 10.1109/lgrs.2023.3320193
- Jan 1, 2023
- IEEE Geoscience and Remote Sensing Letters
In recent years, numerous deep learning (DL)-based frameworks have been proposed for hyperspectral image classification (HSIC). Considering the large number of spectral bands of hyperspectral images (HSIs), it is still challenging to effectively utilize the spectral information and achieve accurate classification when few training samples are available. To make full use of the spectral-spatial information in HSIs with few training samples, in this letter we propose a lightweight end-to-end attention-enhanced feature fusion network (AeF²N). The proposed AeF²N consists of four sequential stages, i.e., spectral feature augmentation, spatial contextual feature interaction, spectral feature augmentation, and classification. The first and third stages are used to capture and augment the discriminative spectral features, while the second stage is used to capture spatial information. Notably, two novel attention blocks, spectral augmentation attention (SAA) and spatial integration attention (SIA), are interactively introduced to capture significant spectral and spatial information, respectively. Based on the proposed spectral and spatial feature discrimination stages, the AeF²N effectively identifies both spectrally significant (e.g., irregular small objects) and spatially significant (e.g., specific-shaped objects) land objects with high accuracy. Experimental results obtained on three benchmark hyperspectral datasets demonstrate the superiority of the proposed approach compared with six state-of-the-art DL-based methods in terms of higher classification accuracy and efficiency.
- Research Article
4
- 10.3390/rs16173124
- Aug 24, 2024
- Remote Sensing
Hyperspectral images have high spectral resolution but low spatial resolution, which can leave the extracted features insufficient and lacking detailed information about ground objects, thus affecting classification accuracy. The numerous spectral bands of hyperspectral images contain rich spectral features but also bring issues of noise and redundancy. To improve the spatial resolution and fully extract spatial and spectral features, this article proposes an improved feature enhancement and extraction model (IFEE) using spatial feature enhancement and attention-guided bidirectional sequential spectral feature extraction for hyperspectral image classification. Adaptive guided filtering is introduced to highlight details and edge features in hyperspectral images. Then, an image enhancement module composed of two-dimensional convolutional neural networks is used to improve the resolution of the image after adaptive guided filtering and provide a high-resolution image with key features emphasized for the subsequent feature extraction module. The proposed spectral attention mechanism helps to extract more representative spectral features, emphasizing useful information while suppressing the interference of noise. Experimental results show that our method outperforms other comparative methods even with very few training samples.
- Research Article
3
- 10.1016/j.ejrs.2024.11.001
- Mar 1, 2025
- The Egyptian Journal of Remote Sensing and Space Sciences
Spectral-Spatial Adaptive Weighted Fusion and Residual Dense Network for hyperspectral image classification
- Research Article
1
- 10.1371/journal.pone.0321559
- May 30, 2025
- PLOS ONE
Deep learning has revolutionized the classification of land cover objects in hyperspectral images (HSIs), particularly by managing the complex 3D cube structure inherent in HSI data. Despite these advances, challenges such as data redundancy, computational costs, insufficient sample sizes, and the curse of dimensionality persist. Traditional 2D Convolutional Neural Networks (CNNs) struggle to fully leverage the interconnections between spectral bands in HSIs, while 3D CNNs, which capture both spatial and spectral features, require more sophisticated design. To address these issues, we propose a novel multilayered, multi-branched 2D-3D CNN model in this paper that integrates Segmented Principal Component Analysis (SPCA) and the minimum-Redundancy-Maximum-Relevance (mRMR) technique. This approach explores the local structure of the data and ranks features by significance. Our approach then hierarchically processes these features: the shallow branch handles the least significant features, the deep branch processes the most critical features, and the mid branch deals with the remaining features. Experimental results demonstrate that our proposed method outperforms most of the state-of-the-art techniques on the Salinas Scene, University of Pavia, and Indian Pines hyperspectral image datasets achieving 100%, 99.94%, and 99.12% Overall Accuracy respectively.
- Research Article
8
- 10.5194/isprs-archives-xliv-m-3-2021-187-2021
- Aug 10, 2021
- The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences
Hyperspectral image classification (HSIC) is a challenging task in remote sensing data analysis, which has been applied in many domains for better identification and inspection of the earth surface by extracting spectral and spatial information. The combination of abundant spectral features and accurate spatial information can improve classification accuracy. However, many traditional methods are based on handcrafted features, which makes multi-class tasks difficult because of spectral intra-class heterogeneity and inter-class similarity. Deep learning algorithms, especially the convolutional neural network (CNN), are regarded as promising feature extractors and classifiers for processing hyperspectral remote sensing images. Although a 2D CNN can extract spatial features, it does not use the specific spectral properties effectively. A 3D CNN can exploit them, but its computational burden increases as layers are stacked. To address these issues, we propose a novel HSIC framework based on a residual CNN that integrates the advantages of 2D and 3D CNNs. First, 3D convolutions focus on extracting spectral features, with feature recalibration and refinement by a channel attention mechanism. The 2D depth-wise separable convolution approach with kernels of different sizes concentrates on obtaining multi-scale spatial features and reducing model parameters. Furthermore, the residual structure optimizes back-propagation for network training. The results and analysis of extensive HSIC experiments show that the proposed residual 2D-3D CNN can effectively extract spectral and spatial features and improve classification accuracy.
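The parameter saving attributed to depth-wise separable convolution can be checked with simple arithmetic. The sketch below compares weight counts for a standard 2D convolution and its depth-wise separable counterpart, ignoring biases. The kernel size and channel counts are hypothetical and are not taken from the paper.

```python
def standard_conv_params(kernel, c_in, c_out):
    """Weights in a standard 2D convolution: every output channel sees every input channel."""
    return kernel * kernel * c_in * c_out

def depthwise_separable_params(kernel, c_in, c_out):
    """Weights in a depth-wise (one kernel per input channel) plus 1x1 point-wise convolution."""
    depthwise = kernel * kernel * c_in
    pointwise = c_in * c_out
    return depthwise + pointwise

# Hypothetical layer: 3x3 kernel, 64 input channels, 128 output channels.
standard = standard_conv_params(3, 64, 128)          # 9 * 64 * 128 = 73728
separable = depthwise_separable_params(3, 64, 128)   # 576 + 8192 = 8768
```

For this hypothetical layer the separable form needs roughly one eighth of the weights, which is the kind of reduction the abstract refers to.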
- Research Article
- 10.3390/app152111738
- Nov 4, 2025
- Applied Sciences
Hyperspectral images (HSIs) are crucial for ground object classification, target detection, and related applications due to their rich spatial spectral information. However, hardware limitations in imaging systems make it challenging to directly acquire HSIs with a high spatial resolution. While deep learning-based single hyperspectral image super-resolution (SHSR) methods have made significant progress, existing approaches primarily rely on convolutional neural networks (CNNs) with fixed geometric kernels, which struggle to model global spatial spectral dependencies effectively. To address this, we propose ESSTformer, a novel SHSR framework that synergistically integrates CNNs’ local feature extraction and Transformers’ global modeling capabilities. Specifically, we design a multi-scale spectral attention module (MSAM) based on dilated convolutions to capture local multi-scale spatial spectral features. Considering the inherent differences between spatial and spectral information, we adopt a decoupled processing strategy by constructing separate spatial and Spectral Transformers. The Spatial Transformer employs window attention mechanisms and an improved convolutional multi-layer perceptron (CMLP) to model long-range spatial dependencies, while the Spectral Transformer utilizes self-attention mechanisms combined with a spectral enhancement module to focus on discriminative spectral features. Extensive experiments on three hyperspectral datasets demonstrate that the proposed ESSTformer achieves a superior performance in super-resolution reconstruction compared to state-of-the-art methods.
- Research Article
10
- 10.1016/j.ecolind.2024.111843
- Mar 1, 2024
- Ecological Indicators
Soil carbon content prediction based on hyperspectral images enables large-scale spatial measurement; with its wide coverage and fast information collection, it is well suited to field data collection. However, research on soil carbon content prediction based on hyperspectral images mainly focuses on feature extraction from spectral information and ignores spatial information, so it cannot fully reveal the intrinsic structural characteristics of the data. To address the lack of spatial feature consideration in hyperspectral images, soil carbon content prediction methods based on multi-scale feature fusion are proposed. While extracting spectral features from hyperspectral images, the spatial information is used for the first time, and a multi-scale spectral and spatial feature network (SpeSpaMN) is designed. In SpeSpaMN, the multi-scale spectral feature network (SpeMN) is constructed to extract spectral features, and the multi-scale spatial feature network (SpaMN) is constructed to extract spatial features. The two networks are fused by using the complementary relationship between features of different scales to achieve soil carbon content prediction based on multi-scale feature fusion. The results showed that SpeSpaMN achieved the best results compared to other methods, followed by SpeMN. The RPD of the Inland, Aoshan Bay and Jiaozhou Bay samples based on SpeSpaMN increased by 47.36%, 37.96% and 4.30%, respectively. This approach addresses the problem of deep fusion of spatial and spectral features in hyperspectral soil carbon content prediction, improving the accuracy and stability of the prediction.
- Research Article
106
- 10.1109/jstars.2021.3065987
- Jan 1, 2021
- IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing
This article proposes a novel hierarchical residual network with attention mechanism (HResNetAM) for hyperspectral image (HSI) spectral-spatial classification to improve the performance of conventional deep learning networks. Straightforward convolutional neural network-based models have limitations in exploiting multiscale spatial and spectral features, which is the key factor in dealing with the high-dimensional nonlinear characteristics present in HSIs. The proposed hierarchical residual network can extract multiscale spatial and spectral features at a granular level, which enlarges the range of receptive fields and enhances the feature representation ability of the model. In addition, we utilize the attention mechanism to set adaptive weights for spatial and spectral features of different scales, which further improves the discriminative ability of the extracted features. Furthermore, a double-branch structure is exploited to extract spectral and spatial features with corresponding convolution kernels in parallel, and the extracted spatial and spectral features of multiple scales are fused for hyperspectral image classification. Four benchmark hyperspectral datasets collected by different sensors and at different acquisition times are employed for classification experiments, and comparative results reveal that the proposed method has competitive advantages in terms of classification performance when compared with other state-of-the-art deep learning models.
- Research Article
2
- 10.3390/technologies13080318
- Jul 23, 2025
- Technologies
This paper presents a novel cochlear implant (CI) sound coding strategy called Spectral Feature Extraction (SFE). The SFE is a novel Fast Fourier Transform (FFT)-based Continuous Interleaved Sampling (CIS) strategy that provides less-smeared spectral cues to CI patients compared to Crystalis, a predecessor strategy used in Oticon Medical devices. The study also explores how the SFE can be enhanced into a Temporal Fine Structure (TFS)-based strategy named Spectral Event Extraction (SEE), combining spectral sharpness with temporal cues. Background/Objectives: Many CI recipients understand speech in quiet settings but struggle with music and complex environments, increasing cognitive effort. De-smearing the power spectrum and extracting spectral peak features can reduce this load. The SFE targets feature extraction from spectral peaks, while the SEE enhances TFS-based coding by tracking these features across frames. Methods: The SFE strategy extracts spectral peaks and models them with synthetic pure tone spectra characterized by instantaneous frequency, phase, energy, and peak resemblance. This deblurs input peaks by estimating their center frequency. In SEE, synthetic peaks are tracked across frames to yield reliable temporal cues (e.g., zero-crossings) aligned with stimulation pulses. Strategy characteristics are analyzed using electrodograms. Results: A flexible Frequency Allocation Map (FAM) can be applied to both SFE and SEE strategies without being limited by FFT bandwidth constraints. Electrodograms of Crystalis and SFE strategies showed that SFE reduces spectral blurring and provides detailed temporal information of harmonics in speech and music. Conclusions: SFE and SEE are expected to enhance speech understanding, lower listening effort, and improve temporal feature coding. These strategies could benefit CI users, especially in challenging acoustic environments.
- Research Article
11
- 10.3390/rs15051302
- Feb 26, 2023
- Remote Sensing
Marine oil spills can cause serious damage to marine ecosystems and biological species, and the pollution is difficult to repair in the short term. Accurate oil type identification and oil thickness quantification are of great significance for marine oil spill emergency response and damage assessment. In recent years, hyperspectral remote sensing technology has become an effective means to monitor marine oil spills. The spectral and spatial features of oil spill images at different levels are different. To accurately identify oil spill types and quantify oil film thickness, and perform better extraction of spectral and spatial features, a multilevel spatial and spectral feature extraction network is proposed in this study. First, the graph convolutional neural network and graph attentional neural network models were used to extract spectral and spatial features in non-Euclidean space, respectively, and then the designed modules based on 2D expansion convolution, depth convolution, and point convolution were applied to extract feature information in Euclidean space; after that, a multilevel feature fusion method was developed to fuse the obtained spatial and spectral features in Euclidean space in a complementary way to obtain multilevel features. Finally, the multilevel features were fused at the feature level to obtain the oil spill information. The experimental results show that compared with CGCNN, SSRN, and A2S2KResNet algorithms, the accuracy of oil type identification and oil film thickness classification of the proposed method in this paper is improved by 12.82%, 0.06%, and 0.08% and 2.23%, 0.69%, and 0.47%, respectively, which proves that the method in this paper can effectively extract oil spill information and identify different oil spill types and different oil film thicknesses.
- Research Article
33
- 10.1109/tgrs.2021.3084922
- Jan 1, 2022
- IEEE Transactions on Geoscience and Remote Sensing
Recent progress in spectral classification is largely attributed to the use of convolutional neural networks (CNNs). While a variety of successful architectures have been proposed, they all extract spectral features from various portions of adjacent spectral bands. In this article, we take a different approach and develop a deep spectral feature fusion method, which extracts both local and interlocal spectral features, capturing thus also the correlations among nonadjacent bands. To our knowledge, this is the first reported deep spectral feature fusion method. Our model is a two-stream architecture, where an intergroup and a groupwise spectral classifier operate in parallel. The interlocal spectral correlation feature extraction is achieved elegantly, by reshaping the input spectral vectors to form the so-called nonadjacent spectral matrices. We introduce the concept of groupwise band convolution to enable the efficient extraction of discriminative local features with multiple kernels adopting the local spectral content. Another important contribution of this work is a novel dual-channel attention mechanism to identify the most informative spectral features. The model is trained in an end-to-end fashion with a joint loss. Experimental results on real datasets demonstrate excellent performance compared with the current state of the art.
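One way to picture how reshaping can bring nonadjacent bands together is shown below. The abstract does not give the paper's exact construction of nonadjacent spectral matrices, so this is only an assumed illustration using strided grouping. The function name `nonadjacent_groups` and the `stride` parameter are hypothetical.

```python
def nonadjacent_groups(spectrum, stride):
    """Arrange a spectrum into `stride` rows; row j holds bands j, j + stride, j + 2 * stride, ..."""
    if stride <= 0:
        raise ValueError("stride must be a positive integer")
    return [list(spectrum[j::stride]) for j in range(stride)]

# Hypothetical 12-band spectrum, represented here by its band indices.
bands = list(range(12))
rows = nonadjacent_groups(bands, 4)
# rows == [[0, 4, 8], [1, 5, 9], [2, 6, 10], [3, 7, 11]]
```

Neighbouring entries within a row are `stride` bands apart in the original spectrum, so a local kernel sliding along a row mixes nonadjacent bands, while neighbouring entries within a column remain adjacent bands.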
- Research Article
19
- 10.1080/01431161.2018.1434324
- Apr 3, 2018
- International Journal of Remote Sensing
With the large number of spectral bands in hyperspectral images, the conventional classification methods commonly used for multispectral images are not effectively applicable. To overcome such difficulty, feature extraction methods could be used to reduce the dimension of hyperspectral images. In this study, the performance of the principal component analysis (PCA) as a widely used technique in feature extraction and the wavelet transform as a powerful decomposition tool on hyperspectral data is compared. In wavelet transform, a non-linear wavelet feature extraction was employed to select efficient features for more classification accuracy. Shortwave infrared bands of Hyperion imagery were selected as input data. The study area includes two well-known porphyry copper deposits, Darrehzar and Sarcheshmeh, located in the Iranian copper belt. Neural networks (NN), Support Vector Machine (SVM), and Spectral Angle Mapper (SAM) were used for multi-class classification based on hydrothermal alteration zones and then trained by mineral spectral features related to typical porphyry copper deposits. In the NN set-up used in this study, one hidden layer was used, with the number of neurons equal to the number of features in the input layer. Conjugate gradient backpropagation was employed as the network training function. Then, the efficiency of feature extraction methods was compared through their classification accuracies. According to the results, although the highest classification accuracy for the PCA method occurs in lower numbers of extracted features compared to wavelet transform, the wavelet transform outperforms the PCA, based on confusion matrix classification. Moreover, NN is stronger than SVM and SAM in discriminating favourable alteration zones associated with porphyry copper mineralization using hyperspectral images.
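Of the three classifiers named above, the Spectral Angle Mapper is the simplest to state: it assigns a pixel to the reference spectrum with which it forms the smallest angle. The sketch below follows that standard definition. The class names and spectra are hypothetical and are not taken from the study.

```python
import math

def spectral_angle(pixel, reference):
    """Angle in radians between two spectra treated as vectors."""
    dot = sum(p * r for p, r in zip(pixel, reference))
    norm_p = math.sqrt(sum(p * p for p in pixel))
    norm_r = math.sqrt(sum(r * r for r in reference))
    if norm_p == 0.0 or norm_r == 0.0:
        raise ValueError("spectra must be non-zero")
    cosine = max(-1.0, min(1.0, dot / (norm_p * norm_r)))   # clamp rounding error
    return math.acos(cosine)

def classify(pixel, references):
    """Return the name of the reference spectrum with the smallest spectral angle."""
    return min(references, key=lambda name: spectral_angle(pixel, references[name]))

# Hypothetical two-band reference spectra.
references = {"class_a": [1.0, 1.0], "class_b": [1.0, 0.0]}
label = classify([1.0, 0.9], references)
```

Because the angle depends only on direction, scaling a spectrum by a constant brightness factor leaves the result unchanged.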
- Research Article
14
- 10.3390/electronics14040797
- Feb 18, 2025
- Electronics
In contrast to conventional remote sensing images, hyperspectral remote sensing images are characterized by a greater number of spectral bands and exceptionally high resolution. The richness of both spectral and spatial information facilitates the precise classification of various objects within the images, establishing hyperspectral imaging as indispensable for remote sensing applications. However, the labor-intensive and time-consuming process of labeling hyperspectral images results in limited labeled samples, while challenges like spectral similarity between different objects and spectral variation within the same object further complicate the development of classification algorithms. Therefore, efficiently exploiting the spatial and spectral information in hyperspectral images is crucial for accomplishing the classification task. To address these challenges, this paper presents a multi-scale feature fusion convolutional neural network (MSFF). The network introduces a dual branch spectral and spatial feature extraction module utilizing 3D depthwise separable convolution for joint spectral and spatial feature extraction, further refined by an attention-based-on-central-pixels (ACP) mechanism. Additionally, a spectral–spatial joint attention module (SSJA) is designed to interactively explore latent dependency between spectral and spatial information through the use of multilayer perceptron and global pooling operations. Finally, a feature fusion module (FF) and an adaptive multi-scale feature extraction module (AMSFE) are incorporated to enable adaptive feature fusion and comprehensive mining of feature information. Experimental results demonstrate that the proposed method performs exceptionally well on the IP, PU, and YRE datasets, delivering superior classification results compared to other methods and underscoring the potential and advantages of MSFF in hyperspectral remote sensing classification.