Abstract
Convolutional neural networks (CNNs) have achieved remarkable results in hyperspectral image (HSI) classification in recent years. However, convolution kernels are reused across different spatial locations; such kernels are spatial-agnostic, or weight-sharing. Furthermore, the preference for spatial compactness in convolution (typically, a 3×3 kernel size) constrains the receptive field and the ability to capture long-range spatial interactions. To mitigate these two issues, in this article we combine a novel operation called involution with residual learning and develop a new deep residual involution network (DRIN) for HSI classification. The proposed DRIN can model long-range spatial interactions by adopting enlarged involution kernels and realizes feature learning in a fairly lightweight manner. Moreover, the large and dynamic involution kernels are distinct over different spatial positions, which allows them to prioritize informative visual patterns in the spatial domain according to the spectral information of the target pixel. The proposed DRIN achieves better classification results than both traditional machine learning-based and convolution-based methods on four HSI datasets. In particular, compared with the convolutional baseline model, i.e., the deep residual network (DRN), our involution-powered DRIN increases the overall classification accuracy by 0.5%, 1.3%, 0.4%, and 2.3% on the University of Pavia, the University of Houston, the Salinas Valley, and the recently released HyRANK HSI benchmark datasets, respectively, demonstrating the potential of involution for HSI classification.
Highlights
Hyperspectral images (HSIs) are three-dimensional (3D) data with hundreds of spectral bands, which contain both spatial information and approximately continuous spectral information.
We propose a spectral feature-based dynamic involution kernel generation function, which adaptively allocates weights over different spatial positions, prioritizing informative visual patterns in the spatial extent according to the spectral information of the target pixel (see the sketch after these highlights).
The proposed deep residual involution network (DRIN) was compared with two traditional machine learning methods, namely, support vector machine (SVM) [11] and extended morphological profiles (EMP) [8], and seven convolution-based networks, namely, the deep&dense convolutional neural network (DenseNet) [57], deep pyramidal residual network (DPRN) [55], fully dense multiscale fusion network (FDMFN) [58], multiscale residual network with mixed depthwise convolution (MSRN) [32], lightweight spectral-spatial convolution module-based residual network (LWRN) [34], spatial-spectral squeeze-and-excitation residual network (SSSERN) [60], and deep residual network (DRN) [64].
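The following is a minimal sketch of the involution operation described in these highlights, assuming PyTorch; the layer names, the group/reduction hyperparameters, and the residual wrapper are illustrative assumptions rather than the authors' released implementation. Each pixel's feature vector (its spectral signature after embedding) is mapped to its own K×K kernel, so the kernels differ across spatial positions while remaining lightweight because they are shared across channel groups.

    import torch
    import torch.nn as nn

    class Involution2d(nn.Module):
        """Involution: per-pixel kernels generated from the target pixel's features."""
        def __init__(self, channels, kernel_size=7, groups=4, reduction=4):
            super().__init__()
            self.k, self.groups = kernel_size, groups
            # Kernel generation function: feature vector -> one K*K kernel per group.
            self.reduce = nn.Conv2d(channels, channels // reduction, 1)
            self.span = nn.Conv2d(channels // reduction,
                                  kernel_size * kernel_size * groups, 1)
            self.unfold = nn.Unfold(kernel_size=kernel_size, padding=kernel_size // 2)

        def forward(self, x):
            b, c, h, w = x.shape
            # Dynamic kernels, distinct at every spatial position: (B, G, 1, K*K, H, W).
            kernels = self.span(self.reduce(x)).view(
                b, self.groups, 1, self.k * self.k, h, w)
            # K*K neighbourhood of every pixel: (B, G, C/G, K*K, H, W).
            patches = self.unfold(x).view(
                b, self.groups, c // self.groups, self.k * self.k, h, w)
            # Weighted sum over each neighbourhood with that pixel's own kernel.
            return (kernels * patches).sum(dim=3).view(b, c, h, w)

    class ResidualInvolutionBlock(nn.Module):
        """One residual unit built around involution, in the spirit of DRIN."""
        def __init__(self, channels, kernel_size=7):
            super().__init__()
            self.inv = Involution2d(channels, kernel_size)
            self.bn = nn.BatchNorm2d(channels)
            self.act = nn.ReLU(inplace=True)

        def forward(self, x):
            return self.act(x + self.bn(self.inv(x)))

Because the kernel at each location is generated on the fly from that location's features, enlarging kernel_size widens the receptive field without adding per-location kernel parameters, which is what makes large involution kernels affordable compared with equally large convolutions.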
Summary
Hyperspectral images (HSIs) are three-dimensional (3D) data with hundreds of spectral bands, which contain both spatial information and approximately continuous spectral information. The abundant spatial-spectral information offers the opportunity for accurate discrimination of diverse materials of interest in the observed scenes. HSIs have been applied in many fields related to Earth observation (EO), such as geological exploration [1,2], precision agriculture [3], and environmental monitoring [4]. Classification is a basic and important technique in the field of HSI processing, which aims to identify the land-cover category of each pixel in the HSI [5,6,7]. In early approaches, handcrafted features [8,9] were first extracted from HSIs and then classified using traditional classifiers, e.g., the support vector machine (SVM) [10,11]. Feature extraction and classification were implemented separately, so the adaptability between these two processes was not fully considered; a sketch of this two-stage pipeline follows.
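Below is a minimal sketch of that classic two-stage pipeline, assuming scikit-learn; the toy data shapes and the use of PCA as a stand-in for a handcrafted feature extractor (e.g., EMP) are illustrative assumptions, not the methods evaluated in this paper. The point it illustrates is that the feature extractor is fitted without any feedback from the classifier.

    import numpy as np
    from sklearn.decomposition import PCA
    from sklearn.svm import SVC

    # Toy HSI cube (H x W x bands) and per-pixel land-cover labels.
    rng = np.random.default_rng(0)
    hsi = rng.random((145, 145, 200))
    labels = rng.integers(0, 16, size=(145, 145))

    # Flatten to one spectral vector per pixel.
    pixels = hsi.reshape(-1, hsi.shape[-1])
    y = labels.reshape(-1)

    # Stage 1: feature extraction, fitted independently of the classifier.
    features = PCA(n_components=30).fit_transform(pixels)

    # Stage 2: classification of the fixed features with an SVM.
    train_idx = rng.choice(len(y), size=2000, replace=False)
    clf = SVC(kernel="rbf").fit(features[train_idx], y[train_idx])
    predictions = clf.predict(features)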