Synergistic 2D/3D Convolutional Neural Network for Hyperspectral Image Classification

Abstract

Accurate hyperspectral image classification has been an important yet challenging task for years. With the recent success of deep learning in various tasks, 2-dimensional (2D) and 3-dimensional (3D) convolutional neural networks (CNNs) have been exploited to capture spectral or spatial information in hyperspectral images. However, few approaches make use of both spectral and spatial information simultaneously, which is critical to accurate hyperspectral image classification. This paper presents a novel Synergistic Convolutional Neural Network (SyCNN) for accurate hyperspectral image classification. The SyCNN consists of a hybrid module that combines 2D and 3D CNNs in feature learning and a data interaction module that fuses spectral and spatial hyperspectral information. In addition, it introduces a 3D attention mechanism before the fully connected layer that helps filter out interfering features and information effectively. Extensive experiments on three public benchmark datasets show that the proposed SyCNN clearly outperforms state-of-the-art techniques that use 2D/3D CNNs.
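Hybrid 2D/3D designs of this kind typically run a 3D convolutional stage first and then fold its spectral depth into the channel axis so a 2D stage can consume the result. The abstract does not give the SyCNN's actual layer configuration, so the shapes below are purely illustrative:

```python
import numpy as np

# Toy feature volume from a 3D conv stage: (channels, spectral_depth, H, W).
# These sizes are assumptions for illustration, not the paper's configuration.
c, d, h, w = 8, 12, 25, 25
feat_3d = np.random.rand(c, d, h, w)

# A common hybrid-model trick: fold the spectral depth into the channel axis
# so a 2D conv stage can consume the volume as (channels*depth, H, W).
feat_2d = feat_3d.reshape(c * d, h, w)
print(feat_2d.shape)  # (96, 25, 25)
```

The reshape loses no information; it only changes which axes the next convolution mixes, which is why such hybrids can trade 3D cost for 2D cost mid-network.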

Similar Papers
  • Conference Article
  • 10.1109/icicsp55539.2022.10050698
Lightweight Multilevel Feature Fusion Network for Hyperspectral Image Classification
  • Nov 26, 2022
  • Quanyu Huang + 3 more

Hyperspectral image (HSI) classification is a key technology in remote sensing image processing. In recent years, the convolutional neural network (CNN), a powerful feature extractor, has been introduced into the field of HSI classification. Since the features of an HSI are the basis of its classification, how to effectively extract spectral-spatial features from HSIs with CNNs has become a research hotspot. HSI feature extraction networks based on two-dimensional (2D) and three-dimensional (3D) CNNs, which can extract both spectral and spatial information, may lead to increased parameters and computational cost. Compared with 2D and 3D CNNs, a one-dimensional (1D) CNN requires far fewer parameters and much less computation; however, 1D CNN based algorithms can only extract spectral information without considering spatial information. Therefore, in this paper, a lightweight multilevel feature fusion network (LMFFN) is proposed for HSI classification, which aims to achieve efficient extraction of spectral-spatial features while minimizing the number of parameters. The main contributions of this paper are twofold. First, we design a hybrid spectral-spatial feature extraction network (HSSFEN) that combines the advantages of 1D, 2D, and 3D CNNs; it introduces depthwise separable convolution, which effectively reduces the complexity of the proposed HSSFEN. Second, a multilevel spectral-spatial feature fusion network (MSSFFN) is proposed to obtain more effective spectral-spatial features by fusing the bottom and top spectral-spatial features. To demonstrate the performance of our method, a series of experiments is conducted on three HSI datasets: Indian Pines, University of Pavia, and Salinas Scene.
The experimental results indicate that the proposed LMFFN achieves better performance than manual feature extraction methods and other deep learning methods, demonstrating the superiority of our approach.
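The parameter savings the abstract appeals to can be checked with simple weight counts. The sizes below are illustrative, not the LMFFN's actual layer widths:

```python
# Weight counts (biases ignored) for a single conv layer with
# illustrative sizes -- not the LMFFN's actual configuration.
c_in, c_out, k = 64, 128, 3

standard_2d = c_in * c_out * k * k                 # normal 2D convolution
depthwise_separable = c_in * k * k + c_in * c_out  # depthwise + 1x1 pointwise
standard_1d = c_in * c_out * k                     # 1D (spectral-only) convolution

print(standard_2d, depthwise_separable, standard_1d)  # 73728 8768 24576
```

For these sizes the depthwise separable layer uses roughly 12% of the standard 2D layer's weights, which is the kind of reduction that motivates hybrid lightweight designs.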

  • Research Article
  • Citations: 2
  • 10.1364/josaa.478585
Hybrid spatial-spectral generative adversarial network for hyperspectral image classification.
  • Feb 21, 2023
  • Journal of the Optical Society of America A
  • Chao Ma + 5 more

In recent years, generative adversarial networks (GANs), consisting of two competing 2D convolutional neural networks (CNNs) used as a generator and a discriminator, have shown promising capabilities in hyperspectral image (HSI) classification tasks. Essentially, the performance of HSI classification lies in the ability to extract both spectral and spatial features. The 3D CNN has excellent advantages in mining these two types of features simultaneously but has rarely been used due to its high computational complexity. This paper proposes a hybrid spatial-spectral generative adversarial network (HSSGAN) for effective HSI classification. The hybrid CNN structure is developed for the construction of the generator and the discriminator. For the discriminator, the 3D CNN is utilized to extract multi-band spatial-spectral features, and the 2D CNN is then used to further represent the spatial information. To reduce the accuracy loss caused by information redundancy, a channel and spatial attention mechanism (CSAM) is specially designed. Specifically, a channel attention mechanism is exploited to enhance the discriminative spectral features, and a spatial self-attention mechanism is developed to learn long-term spatial similarity, which can effectively suppress invalid spatial features. Both quantitative and qualitative experiments on four widely used hyperspectral datasets show that the proposed HSSGAN achieves a satisfactory classification effect compared to conventional methods, especially with few training samples.
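The channel branch of an attention mechanism like the CSAM is commonly built in the squeeze-and-excitation style: global average pooling per channel, a small bottleneck MLP, and a sigmoid gate. The paper's exact design is not given here, so this is a generic sketch with made-up sizes and random weights:

```python
import numpy as np

def channel_attention(feat, w1, w2):
    """SE-style channel attention: squeeze (global average pool),
    excite (two-layer MLP + sigmoid), then rescale each channel."""
    c = feat.shape[0]
    squeezed = feat.reshape(c, -1).mean(axis=1)        # (C,) global average pool
    hidden = np.maximum(squeezed @ w1, 0.0)            # ReLU bottleneck
    weights = 1.0 / (1.0 + np.exp(-(hidden @ w2)))     # sigmoid gate in (0, 1)
    return feat * weights[:, None, None]               # per-channel rescaling

rng = np.random.default_rng(0)
feat = rng.standard_normal((16, 9, 9))                 # (C, H, W) feature map
w1 = rng.standard_normal((16, 4)) * 0.1                # reduction to C//4
w2 = rng.standard_normal((4, 16)) * 0.1
out = channel_attention(feat, w1, w2)
print(out.shape)  # (16, 9, 9)
```

Because the gate lies strictly in (0, 1), attention can only attenuate channels, never amplify them, which is how it suppresses redundant spectral features.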

  • Research Article
  • Citations: 10
  • 10.1109/tgrs.2023.3282247
A Lightweight Hybrid Convolutional Neural Network for Hyperspectral Image Classification
  • Jan 1, 2023
  • IEEE Transactions on Geoscience and Remote Sensing
  • Xiaohu Ma + 6 more

Recent studies have demonstrated the potential of hybrid convolutional models that combine 3D and 2D convolutional neural networks (CNNs) for hyperspectral image (HSI) classification. However, these models do not fully utilize the benefits of hybrid convolution due to inefficient connections between the two types of CNNs. Moreover, most CNNs, including hybrid models, require a significant number of parameters and computational resources for accurate classification, which increases the need for labeled samples and computational cost. Although the common lightweight strategies like depthwise separable convolution (DSC) can reduce parameters and computation compared to normal convolution (NC), they often compromise accuracy. To address these challenges, we propose a lightweight hybrid convolutional neural network (Lite-HCNet) for HSI classification with minimal model parameters and computational effort. Firstly, we design a novel channel attention module (NCAM) and combine it with a convolutional kernel decomposition (CKD) strategy to propose a lightweight and efficient DSC (LE-DSC) deployed in Lite-HCNet. The LE-DSC not only reduces the DSC volume further but also enhances its performance. Secondly, a lightweight and efficient hybrid convolutional layer (LE-HCL) is designed in Lite-HCNet to explore the efficient connection structure between 3D CNNs and 2D CNNs. Experiments show that the Lite-HCNet reduces the required computational cost and practical deployment difficulty while offering advanced performance with a small number of training samples. Furthermore, abundant ablation experiments confirm the superior performance of the designed LE-DSC.

  • Research Article
  • 10.18698/0236-3933-2022-1-100-118
Classification of Hyperspectral Earth Remote-Sensing Data Using Combined 3D-2D Convolutional Neural Networks
  • Mar 1, 2022
  • Herald of the Bauman Moscow State Technical University. Series Instrument Engineering
  • L.T Nyan + 2 more

Hyperspectral image classification is used for analyzing remote Earth-sensing data, and convolutional neural networks are among the most commonly used deep learning methods for processing visual data. The article considers the proposed hybrid 3D-2D spectral convolutional neural network for hyperspectral image classification. At the initial stage, a simple combined trainable deep learning model was proposed, constructed by combining 2D and 3D convolutional neural networks to extract deeper spatial-spectral features with fewer 3D-2D convolutions. The 3D network facilitates the joint spatial-spectral representation of objects from a stack of spectral bands, and the 3D-2D convolutional features are used for classifying hyperspectral images. Principal component analysis (PCA) is applied to reduce dimensionality. Classification experiments were performed on the Indian Pines, University of Pavia, and Salinas Scene remote sensing datasets. The first layer of the feature map is used as input for subsequent layers in predicting final labels for each hyperspectral pixel. The proposed method not only retains the benefits of advanced feature extraction with convolutional neural networks but also makes full use of spectral and spatial information. Its effectiveness was tested on three reference datasets; the results show that a multifunctional learning system based on such networks significantly improves classification accuracy (more than 99%).
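The PCA preprocessing step described above is standard for HSI pipelines: the per-pixel spectra are treated as samples and the band dimension is projected onto the top principal components. A minimal numpy sketch, with a toy cube size chosen for illustration:

```python
import numpy as np

def pca_reduce(cube, n_components):
    """Reduce the spectral dimension of an (H, W, B) hyperspectral cube
    to n_components via PCA over the per-pixel spectra."""
    h, w, b = cube.shape
    pixels = cube.reshape(-1, b)                     # each pixel's spectrum is a sample
    centered = pixels - pixels.mean(axis=0)
    cov = centered.T @ centered / (centered.shape[0] - 1)  # band covariance
    eigvals, eigvecs = np.linalg.eigh(cov)
    top = eigvecs[:, np.argsort(eigvals)[::-1][:n_components]]  # top components
    return (centered @ top).reshape(h, w, n_components)

cube = np.random.rand(10, 10, 200)   # toy cube: 200 spectral bands
reduced = pca_reduce(cube, 30)
print(reduced.shape)  # (10, 10, 30)
```

Shrinking 200 bands to 30 components cuts the depth of every subsequent 3D convolution, which is exactly why PCA is applied before the 3D-2D network.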

  • Research Article
  • Citations: 11
  • 10.3390/rs13214472
Deep Spectral Spatial Inverted Residual Network for Hyperspectral Image Classification
  • Nov 7, 2021
  • Remote Sensing
  • Tianyu Zhang + 3 more

Convolutional neural networks (CNNs) have been widely used in hyperspectral image classification in recent years. The training of CNNs relies on a large amount of labeled sample data. However, the number of labeled samples of hyperspectral data is relatively small. Moreover, for hyperspectral images, fully extracting spectral and spatial feature information is the key to achieve high classification performance. To solve the above issues, a deep spectral spatial inverted residuals network (DSSIRNet) is proposed. In this network, a data block random erasing strategy is introduced to alleviate the problem of limited labeled samples by data augmentation of small spatial blocks. In addition, a deep inverted residuals (DIR) module for spectral spatial feature extraction is proposed, which locks the effective features of each layer while avoiding network degradation. Furthermore, a global 3D attention module is proposed, which can realize the fine extraction of spectral and spatial global context information under the condition of the same number of input and output feature maps. Experiments are carried out on four commonly used hyperspectral datasets. A large number of experimental results show that compared with some state-of-the-art classification methods, the proposed method can provide higher classification accuracy for hyperspectral images.
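The "data block random erasing" augmentation the DSSIRNet abstract mentions can be sketched as zeroing a random spatial block in each training patch. The exact erasing policy (block size, probability, fill value) is not given here, so everything below is an assumption for illustration:

```python
import numpy as np

def random_erase(patch, erase_size, rng):
    """Zero out a randomly placed erase_size x erase_size spatial block
    in an (H, W, B) patch -- a simple form of random-erasing augmentation."""
    h, w, _ = patch.shape
    top = rng.integers(0, h - erase_size + 1)
    left = rng.integers(0, w - erase_size + 1)
    out = patch.copy()
    out[top:top + erase_size, left:left + erase_size, :] = 0.0
    return out

rng = np.random.default_rng(42)
patch = np.ones((11, 11, 30))        # toy 11x11 patch with 30 bands
erased = random_erase(patch, 3, rng)
print(int((erased == 0).sum()))      # 3 * 3 * 30 = 270 zeroed values
```

Erasing spatial blocks rather than single pixels forces the network to rely on context, which is the stated motivation for augmenting small spatial blocks under limited labels.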

  • Research Article
  • Citations: 2
  • 10.3390/rs16224202
SSFAN: A Compact and Efficient Spectral-Spatial Feature Extraction and Attention-Based Neural Network for Hyperspectral Image Classification
  • Nov 11, 2024
  • Remote Sensing
  • Chunyang Wang + 6 more

Hyperspectral image (HSI) classification is a crucial technique that assigns each pixel in an image to a specific land cover category by leveraging both spectral and spatial information. In recent years, HSI classification methods based on convolutional neural networks (CNNs) and Transformers have significantly improved performance due to their strong feature extraction capabilities. However, these improvements often come with increased model complexity, leading to higher computational costs. To address this, we propose a compact and efficient spectral-spatial feature extraction and attention-based neural network (SSFAN) for HSI classification. The SSFAN model consists of three core modules: the Parallel Spectral-Spatial Feature Extraction Block (PSSB), the Scan Block, and the Squeeze-and-Excitation MLP Block (SEMB). After preprocessing, the HSI data are fed into the PSSB module, which contains two parallel streams, each comprising a 3D convolutional layer and a 2D convolutional layer. The 3D convolutional layer extracts spectral and spatial features from the input hyperspectral data, while the 2D convolutional layer further enhances the spatial feature representation. Next, the Scan Block module employs a layered scanning strategy to extract spatial information at different scales from the central pixel outward, enabling the model to capture both local and global spatial relationships. The SEMB module combines the Spectral-Spatial Recurrent Block (SSRB) and the MLP Block. The SSRB, with its adaptive weight assignment mechanism in the SToken Module, flexibly handles time steps and feature dimensions, performing deep spectral and spatial feature extraction through multiple state updates. Finally, the MLP Block processes the input features through a series of linear transformations, GELU activation functions, and Dropout layers, capturing complex patterns and relationships within the data, and concludes with an argmax layer for classification.
Experimental results show that the proposed SSFAN model delivers superior classification performance, outperforming the second-best method by 1.72%, 5.19%, and 1.94% in OA, AA, and Kappa coefficient, respectively, on the Indian Pines dataset. Additionally, it requires less training and testing time compared to other state-of-the-art deep learning methods.

  • Book Chapter
  • 10.1007/978-981-99-0085-5_33
A Hybrid Approach Using Wavelet and 2D Convolutional Neural Network for Hyperspectral Image Classification
  • Jan 1, 2023
  • Apoorv Joshi + 5 more

Hyperspectral Image (HSI) classification is used for the examination of images captured without being physically present at the imaged site. In recent years, many new approaches have been proposed for HSI classification. The Convolutional Neural Network (CNN) is very popular among them and is widely used because of its ability to extract critical features and its good performance. SVMs, 2D CNNs, 3D CNNs, and 3D-2D CNNs are some of the methods used for feature extraction. 3D CNNs are computationally complex, whereas 2D CNNs focus on spatial information but do not involve multi-resolution image processing. This model uses a variation of the 2D CNN, the wavelet CNN, for the classification of hyperspectral images. To test the performance of this hybrid model, the Salinas Scene dataset is used. The proposed model provides good results, with overall accuracy, average accuracy, and kappa score of 99.87%, 99.88%, and 99.85%, respectively.
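A wavelet CNN gets its multi-resolution view by feeding wavelet subbands alongside (or instead of) raw pixels. The paper's wavelet choice and levels are not stated here; below is a single-level 2D Haar decomposition as a generic illustration:

```python
import numpy as np

def haar_2d(img):
    """One level of the 2D Haar transform: returns the approximation (LL)
    and detail (LH, HL, HH) subbands, each half the input resolution."""
    a = (img[0::2, :] + img[1::2, :]) / 2.0   # row averages
    d = (img[0::2, :] - img[1::2, :]) / 2.0   # row differences
    ll = (a[:, 0::2] + a[:, 1::2]) / 2.0      # low-low: coarse approximation
    lh = (a[:, 0::2] - a[:, 1::2]) / 2.0      # horizontal detail
    hl = (d[:, 0::2] + d[:, 1::2]) / 2.0      # vertical detail
    hh = (d[:, 0::2] - d[:, 1::2]) / 2.0      # diagonal detail
    return ll, lh, hl, hh

img = np.arange(64, dtype=float).reshape(8, 8)
ll, lh, hl, hh = haar_2d(img)
print(ll.shape)  # (4, 4)
```

Stacking the four subbands gives a 2D CNN access to coarse structure and edge detail at once, approximating what a deeper or 3D network would otherwise have to learn.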

  • Research Article
  • Citations: 16
  • 10.3390/rs13183590
A Spectral Spatial Attention Fusion with Deformable Convolutional Residual Network for Hyperspectral Image Classification
  • Sep 9, 2021
  • Remote Sensing
  • Tianyu Zhang + 3 more

Convolutional neural networks (CNNs) have exhibited excellent performance in hyperspectral image classification. However, due to the lack of labeled hyperspectral data, it is difficult to achieve high classification accuracy of hyperspectral images with fewer training samples. In addition, although some deep learning techniques have been used in hyperspectral image classification, due to the abundant information of hyperspectral images, the problem of insufficient spatial spectral feature extraction still exists. To address the aforementioned issues, a spectral–spatial attention fusion with a deformable convolution residual network (SSAF-DCR) is proposed for hyperspectral image classification. The proposed network is composed of three parts, and each part is connected sequentially to extract features. In the first part, a dense spectral block is utilized to reuse spectral features as much as possible, and a spectral attention block that can refine and optimize the spectral features follows. In the second part, spatial features are extracted and selected by a dense spatial block and attention block, respectively. Then, the results of the first two parts are fused and sent to the third part, and deep spatial features are extracted by the DCR block. The above three parts realize the effective extraction of spectral–spatial features, and the experimental results for four commonly used hyperspectral datasets demonstrate that the proposed SSAF-DCR method is superior to some state-of-the-art methods with very few training samples.

  • Research Article
  • Citations: 9
  • 10.1049/ipr2.12632
Hybrid network model based on 3D convolutional neural network and scalable graph convolutional network for hyperspectral image classification
  • Sep 25, 2022
  • IET Image Processing
  • Xili Wang + 1 more

Hyperspectral images (HSIs) contain hundreds of continuous spectral bands and are rich in spectral-spatial information. For HSI classification, traditional convolutional neural networks (CNNs) extract features from the HSI's spectral-spatial information through 2D convolution. However, 2D convolution extracts features in the 2D plane without considering relationships between spectral bands, which inevitably leads to insufficient feature extraction. 3D convolutional neural networks (3D CNNs) take account of the correlations among spectral bands and outperform 2D convolutional networks in feature extraction, but their computational cost is rather expensive. To address this problem, a lightweight three-layer 3D convolutional module (3D-M) for spectral-spatial feature extraction is proposed. Another challenge is that neither 2D nor 3D convolution utilizes the structural information inherent in the data. Graph convolutional networks (GCNs) can model and utilize such information through the similarity matrix, also known as the adjacency matrix. However, traditional GCNs cannot handle large-scale data because they construct the adjacency matrix over all data, which results in high computational complexity and large storage requirements. To overcome this challenge, this article proposes a batch-graph strategy on which a scalable GCN is developed. Finally, a hybrid network model (HNM) based on the proposed lightweight 3D-M and scalable GCN is presented. The HNM extracts spectral-spatial features of HSIs with low computational complexity through the lightweight 3D convolutional network and leverages the structural information in the data via the scalable GCN. Experimental results on three public datasets of different sizes demonstrate that the proposed HNM produces better classification results than other state-of-the-art hyperspectral image classification models in terms of overall accuracy (OA), average accuracy (AA), and kappa coefficient (Kappa).
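The GCN propagation that the adjacency matrix drives is usually the Kipf-and-Welling rule X' = sigma(D^-1/2 (A+I) D^-1/2 X W). The HNM's exact layer form is not given here, so this is a generic one-layer sketch on a toy graph:

```python
import numpy as np

def gcn_layer(a, x, w):
    """One GCN propagation step: ReLU(D^-1/2 (A+I) D^-1/2 X W)."""
    a_hat = a + np.eye(a.shape[0])                 # add self-loops
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))  # degree normalization
    a_norm = a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    return np.maximum(a_norm @ x @ w, 0.0)

# Toy 4-node graph (e.g. 4 superpixels) with 3-dim node features.
a = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
x = np.random.rand(4, 3)
w = np.random.rand(3, 2)
print(gcn_layer(a, x, w).shape)  # (4, 2)
```

The batch-graph idea in the abstract amounts to building `a` only over the nodes in a mini-batch rather than over all pixels, keeping the adjacency matrix small.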

  • Conference Article
  • Citations: 1
  • 10.1109/igarss46834.2022.9883452
Markov Random Field Based Spectral-Spatial Fusion Network for Hyperspectral Image Classification
  • Jul 17, 2022
  • Yao Peng + 1 more

In the hyperspectral image (HSI) classification task, effectively deriving and incorporating spatial information into spectral features is a key focus, as it can largely influence performance. Markov random fields (MRFs) are generative and flexible image texture models capable of effectively extracting spatial neighbourhood information along multiple spectral wavebands in an unsupervised way. Their parameter estimation process also shares strong compatibility with deep architectures, especially convolutional neural networks. In this work, we propose an MRF based spectral-spatial fusion network (SSFNet) for HSI classification. Spatial features are extracted using MRF models and fused with spectral information; the proposed SSFNet then takes the fused features as input and produces reliable classification results. Comprehensive experiments conducted on the Indian Pines and Pavia University datasets verify the proposed method.

  • Research Article
  • Citations: 19
  • 10.1109/tgrs.2022.3180685
Grafting Transformer on Automatically Designed Convolutional Neural Network for Hyperspectral Image Classification
  • Jan 1, 2022
  • IEEE Transactions on Geoscience and Remote Sensing
  • Xizhe Xue + 4 more

Hyperspectral image (HSI) classification has been a hot topic for decades, as hyperspectral images have rich spatial and spectral information and provide a strong basis for distinguishing different land-cover objects. Benefiting from the development of deep learning technologies, deep learning based HSI classification methods have achieved promising performance. Recently, several neural architecture search (NAS) algorithms have been proposed for HSI classification, which further improve the accuracy of HSI classification to a new level. In this paper, NAS and the Transformer are combined for the HSI classification task for the first time. Compared with previous work, the proposed method has two main differences. First, we revisit the search spaces designed in previous HSI classification NAS methods and propose a novel hybrid search space, consisting of a space-dominated cell and a spectrum-dominated cell. Compared with search spaces proposed in previous works, the proposed hybrid search space is better aligned with the characteristics of HSI data, that is, HSIs have a relatively low spatial resolution and an extremely high spectral resolution. Second, to further improve classification accuracy, we graft the emerging Transformer module onto the automatically designed convolutional neural network (CNN) to add global information to the local-region-focused features learned by the CNN. Experimental results on three public HSI datasets show that the proposed method achieves much better performance than comparison approaches, including manually designed networks and NAS based HSI classification methods. Especially on the most recently captured Houston University dataset, overall accuracy is improved by nearly 6 percentage points. Code is available at: https://github.com/Cecilia-xue/HyT-NAS.

  • Research Article
  • Citations: 12
  • 10.1080/01431161.2023.2249598
CNN and Transformer interaction network for hyperspectral image classification
  • Sep 8, 2023
  • International Journal of Remote Sensing
  • Zhongwei Li + 4 more

The Convolutional Neural Network (CNN) has advanced hyperspectral image (HSI) classification effectively. Although many CNN-based models can extract local features in HSIs, it is difficult for them to extract global features. With its ability to capture long-range dependencies, the Transformer is gradually gaining prominence in HSI classification, but it may overlook some local details when extracting features. To address these issues, we propose a CNN and Transformer interaction network (CTIN) for HSI classification. First, a dual-branch structure is constructed in which CNN and Transformer branches are arranged in parallel to simultaneously extract local and global features from the HSI. Second, a feature interaction module is placed between the two branches, facilitating a bi-directional flow of information between the global and local feature spaces. In this way, the network combines the advantages of the CNN and the Transformer in feature extraction as much as possible. In addition, a token generation method is designed to harness abundant contextual information relevant to the centre pixel and improve the accuracy of the final classification. Experiments were conducted on four hyperspectral datasets (two classical datasets, Indian Pines and Salinas Valley; a new satellite dataset, Yellow River; and a self-made UAV dataset, Yellow River Willow). Experimental results show that the proposed method outperforms other state-of-the-art methods, with overall accuracies of 99.21%, 99.61%, 92.40%, and 98.17%, respectively.

  • Research Article
  • Citations: 10
  • 10.3390/rs15071758
Multi-Scale Spectral-Spatial Attention Network for Hyperspectral Image Classification Combining 2D Octave and 3D Convolutional Neural Networks
  • Mar 24, 2023
  • Remote Sensing
  • Lianhui Liang + 4 more

Traditional convolutional neural networks (CNNs) can be applied to obtain spectral-spatial feature information from hyperspectral images (HSIs). However, they often introduce significant redundant spatial feature information. The octave convolution network is frequently utilized instead of the traditional CNN to decrease the spatial redundancy of the network and extend its receptive field. However, 3D octave convolution-based approaches may introduce extensive parameters and complicate the network. To solve these issues, we propose a new HSI classification approach with a multi-scale spectral-spatial network-based framework that combines 2D octave and 3D CNNs. Our method, called MOCNN, first utilizes 2D octave convolution and 3D DenseNet branch networks with various convolutional kernel sizes to obtain complex spatial contextual feature information and spectral characteristics separately. Moreover, channel and spectral attention mechanisms are applied to these two branch networks, respectively, to emphasize significant feature regions and important spectral bands that carry discriminative information for categorization. Furthermore, a sample balancing strategy is applied to address the sample imbalance problem. Extensive experiments are undertaken on four HSI datasets, demonstrating that our MOCNN approach outperforms several other methods for HSI classification, especially in scenarios dominated by limited and imbalanced sample data.
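Octave convolution cuts spatial redundancy by keeping a fraction of the channels (the low-frequency part) at half resolution. The MOCNN's actual split ratio is not stated here; the sketch below uses a hypothetical `alpha=0.5` and shows only the feature-splitting step, not the full octave convolution:

```python
import numpy as np

def octave_split(feat, alpha=0.5):
    """Split (C, H, W) features into a high-frequency part at full
    resolution and a low-frequency part at half resolution (2x2 mean
    pooling), with alpha giving the fraction of low-frequency channels."""
    c = feat.shape[0]
    c_low = int(alpha * c)
    high = feat[c_low:]                       # full-resolution channels
    low_full = feat[:c_low]
    # 2x2 average pooling halves each spatial dimension.
    low = low_full.reshape(c_low, low_full.shape[1] // 2, 2,
                           low_full.shape[2] // 2, 2).mean(axis=(2, 4))
    return high, low

feat = np.random.rand(16, 8, 8)
high, low = octave_split(feat)
print(high.shape, low.shape)  # (8, 8, 8) (8, 4, 4)
```

Storing half the channels at quarter the spatial area is what reduces both memory and the number of multiply-adds in the low-frequency path.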

  • Research Article
  • Citations: 9
  • 10.1080/17538947.2023.2300319
SSC-SFN: spectral-spatial non-local segment federated network for hyperspectral image classification with limited labeled samples
  • Jan 9, 2024
  • International Journal of Digital Earth
  • Quanshan Gao + 2 more

Hyperspectral image (HSI) classification methods based on deep learning (DL) have performed well in numerous investigations. Although many modified superpixel-wise neural networks are utilized to enhance spatial information, their ability to mine spectral information in graph structures is insufficient. Moreover, single classifier approaches are unable to extract adequate spatial and spectral information simultaneously. For the classification of large-scale research areas, many works have relied on the use of a large number of labeled samples, leading to low efficiency and weak generalization. To address these issues, an effective spectral-spatial HSI classification approach based on spectral-spatial non-local segment federated network (SSC-SFN) was developed in this study. In this framework, deconvolution is employed to recover the data size, while the lost spatial information is replaced by up-pooling. The spectral dimensional features are updated through the generation of non-Euclidean graph structures and the non-local segment smoothing technique. The convolutional neural network and graph convolutional network techniques are coupled to exploit the available spectral and spatial structure information fully. Extensive experimental results obtained using four public benchmark datasets show that the classification accuracy of SSC-SFN can exceed 90% for large-scale HSIs with limited samples.

  • Research Article
  • Citations: 83
  • 10.1109/tgrs.2023.3265879
Attention Multihop Graph and Multiscale Convolutional Fusion Network for Hyperspectral Image Classification
  • Jan 1, 2023
  • IEEE Transactions on Geoscience and Remote Sensing
  • Hao Zhou + 5 more

Convolutional neural networks (CNNs) for hyperspectral image (HSI) classification have made good progress. Meanwhile, graph convolutional networks (GCNs) have also attracted considerable attention by using unlabeled data and broadly, explicitly exploiting correlations between adjacent parcels. However, a CNN with a fixed square convolution kernel is not flexible enough to deal with irregular patterns, while a GCN using superpixels to reduce the number of nodes loses pixel-level features, and the features from the two networks are always partial. In this paper, to make good use of the advantages of CNNs and GCNs, we propose a novel multiple feature fusion model termed the attention multi-hop graph and multi-scale convolutional fusion network (AMGCFN), which includes two sub-networks, a multi-scale fully convolutional network and a multi-hop GCN, to extract multi-level information from the HSI. Specifically, the multi-scale fully convolutional network comprehensively captures pixel-level features with different kernel sizes, and a multi-head attention fusion module fuses the multi-scale pixel-level features. The multi-hop GCN systematically aggregates multi-hop contextual information by applying multi-hop graphs on different layers to transform the relationships between nodes, and a multi-head attention fusion module combines the multi-hop features. Finally, we design a cross-attention fusion module to adaptively fuse the features of the two sub-networks. AMGCFN makes full use of multi-scale convolution and multi-hop graph features, which is conducive to learning multi-level contextual semantic features. Experimental results on three benchmark HSI datasets show that AMGCFN outperforms several state-of-the-art methods.
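A multi-hop graph can be built from powers of the self-looped adjacency matrix: entry (i, j) of (A+I)^k is nonzero exactly when node j is reachable from node i within k hops. The AMGCFN's precise multi-hop construction is not given here, so this is a minimal reachability sketch:

```python
import numpy as np

def k_hop_adjacency(a, k):
    """Binary k-hop adjacency: 1 where a node is reachable within
    k hops (self-loops included), built from powers of (A + I)."""
    a_hat = a + np.eye(a.shape[0])
    return (np.linalg.matrix_power(a_hat, k) > 0).astype(float)

# Path graph 0-1-2-3: node 0 reaches node 2 only once 2 hops are allowed.
a = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
print(int(k_hop_adjacency(a, 1)[0, 2]), int(k_hop_adjacency(a, 2)[0, 2]))  # 0 1
```

Applying graphs with growing k on successive layers widens each node's receptive field over the parcel graph, analogous to stacking convolutions with larger kernels.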
