An Encoder–Decoder Convolution Network With Fine-Grained Spatial Information for Hyperspectral Images Classification

Zhongwei Li, Fangming Guo, Leiquan Wang, Guangbo Ren, Qi Li

doi:10.1109/access.2020.2974025

Abstract

Convolutional Neural Networks (CNNs) are widely used in Hyperspectral Images (HSIs) classification. However, most CNN-based HSIs classification methods discard the fine-grained spatial (FGS) details during their sequence of convolution and pooling operations. To address this issue, a unified encoder-decoder framework, denoted FGSCNN, is proposed to integrate high-level semantics and FGS details for HSIs classification. The encoder, consisting of a series of convolution and pooling layers, captures the high-level semantic information in low-resolution feature maps. The decoder fuses the high-level, low-resolution semantic information with the fine-grained, high-resolution spatial information to obtain FGS features with high-level semantics. Deconvolution layers and skip connections are used in the decoder to retain the FGS details, while convolution layers combine the FGS features with the high-level semantics. Based on the encoder-decoder framework, a unified loss function is used to integrate the high-level semantic information and the FGS details in an end-to-end manner for HSIs classification. Experiments conducted on three public datasets, i.e., Indian Pines, Pavia University and Salinas, demonstrate the effectiveness of the proposed method for HSIs classification.

Highlights

Hyperspectral Images (HSIs) contain a great deal of spatial geometric information and spectral information reflecting various characteristics of ground objects.
Most Convolutional Neural Network (CNN)-based HSIs classification methods discard the fine-grained spatial (FGS) details during their sequence of convolution and pooling operations.

Summary

INTRODUCTION

Hyperspectral Images (HSIs) contain a great deal of spatial geometric information and spectral information reflecting various characteristics of ground objects. With the development of Convolutional Neural Networks (CNNs), various CNN architectures [10]–[13], such as Google Inception [17], VGG, ResNet [18] and DenseNet [19], have been applied to HSIs to extract high-level spectral, spatial and spectral-spatial features [14]–[16]. These CNN-based methods [20]–[22] are trained end-to-end under the supervision of high-level class labels [23]–[26]. However, most of them discard the fine-grained spatial (FGS) details during their sequence of convolution and pooling operations. To address this issue, we aim to build a unified encoder-decoder framework that integrates high-level semantics and FGS details for HSIs classification, namely the FGS-based CNN (FGSCNN). In this method, the encoder captures the high-level semantic information in low-resolution feature maps. A fully convolutional network (FCN) model pre-trained on a natural image dataset cannot retain the high-resolution spectral information in HSIs. In this paper, a unified encoder-decoder network is therefore built to extract FGS features with high-level semantics for HSIs classification.
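The loss of FGS detail under pooling, and the way a skip connection preserves it, can be illustrated with a toy example. The sketch below is not the FGSCNN implementation: it uses a single-channel 4x4 patch, one 2x2 max-pooling step as the "encoder", and nearest-neighbour upsampling as a fixed stand-in for the learned deconvolution layers. All function names (`max_pool_2x2`, `upsample_2x`, `skip_fuse`, `toy_patch`) are hypothetical and chosen for illustration only.

```python
"""Toy illustration: pooling discards fine spatial detail, a skip connection keeps it."""


def max_pool_2x2(fmap):
    """Downsample a 2-D map (list of rows) by taking the max of each 2x2 block."""
    rows, cols = len(fmap), len(fmap[0])
    return [
        [max(fmap[r][c], fmap[r][c + 1], fmap[r + 1][c], fmap[r + 1][c + 1])
         for c in range(0, cols, 2)]
        for r in range(0, rows, 2)
    ]


def upsample_2x(fmap):
    """Nearest-neighbour upsampling (assumed stand-in for a learned deconvolution)."""
    out = []
    for row in fmap:
        wide = [v for v in row for _ in range(2)]  # repeat each value horizontally
        out.append(wide)
        out.append(list(wide))                     # repeat each row vertically
    return out


def skip_fuse(decoded, encoder_map):
    """Stack the upsampled map and the same-resolution encoder map as two channels."""
    assert len(decoded) == len(encoder_map)
    assert len(decoded[0]) == len(encoder_map[0])
    return [decoded, encoder_map]


def toy_patch(hot_row, hot_col, size=4):
    """Build a size x size patch that is zero except for one pixel set to 1.0."""
    return [[1.0 if (r, c) == (hot_row, hot_col) else 0.0 for c in range(size)]
            for r in range(size)]


# Two patches that differ only in where the bright pixel sits inside one 2x2 block.
patch_a = toy_patch(0, 0)
patch_b = toy_patch(1, 1)

# "Encoder": one pooling step yields low-resolution maps.
coarse_a = max_pool_2x2(patch_a)
coarse_b = max_pool_2x2(patch_b)

# "Decoder" without a skip connection: upsample the coarse maps back to 4x4.
decoded_a = upsample_2x(coarse_a)
decoded_b = upsample_2x(coarse_b)

# "Decoder" with a skip connection: also carry the high-resolution map forward.
fused_a = skip_fuse(decoded_a, patch_a)
fused_b = skip_fuse(decoded_b, patch_b)

print(coarse_a == coarse_b)    # True: pooling erased the pixel's exact position
print(decoded_a == decoded_b)  # True: upsampling alone cannot recover it
print(fused_a == fused_b)      # False: the skip connection retains the detail
```

In a real network, the fused channels would then pass through convolution layers so that the high-resolution detail and the upsampled semantic features are combined, which is the role the convolution layers of the decoder play in the description above.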

METHODS

THE ENCODER ARCHITECTURE OF FGSCNN

LOSS FUNCTION

Findings

CONCLUSION