Deep-learning-based fusion of spectral–spatial features has become a focus of research in hyperspectral image (HSI) classification. However, previous deep frameworks based on spectral–spatial fusion usually aggregate features only at the ends of the branches. Furthermore, they consider only first-order statistical features in the fusion process, which limits the discriminability of the fused spectral–spatial features. This article proposes an end-to-end classification architecture with global–local hierarchical weighted fusion. The architecture includes two subnetworks, one for spectral classification and one for spatial classification. For the spectral subnetwork, two band-grouping strategies are designed, and bidirectional long short-term memory is used to capture spectral context information from global to local perspectives. For the spatial subnetwork, a pooling strategy based on local attention is incorporated to construct a global–local pooling fusion module, which enhances the discriminability of the spatial features learned by a convolutional neural network. For the fusion stage, a hierarchical weighted fusion mechanism is developed to capture the nonlinear relationship between spectral and spatial features. Experimental results on four real HSI datasets and a GF-5 satellite dataset demonstrate that the proposed method is more competitive in terms of accuracy and generalization.
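As a rough illustration of what band grouping for a recurrent spectral model can look like, the sketch below splits one pixel's spectrum into groups in two ways. The abstract does not specify the two strategies, so both functions (`group_contiguous` and `group_interleaved`) are hypothetical stand-ins rather than the authors' design: the first keeps adjacent bands together so each group covers a local spectral range, and the second spreads each group across the whole spectrum. In a setup like the one described, each list of groups would then be fed as a sequence to a bidirectional recurrent layer.

```python
from typing import List, Sequence


def group_contiguous(spectrum: Sequence[float], n_groups: int) -> List[List[float]]:
    """Hypothetical strategy: adjacent bands share a group (local spectral range)."""
    if not 1 <= n_groups <= len(spectrum):
        raise ValueError("n_groups must be between 1 and the number of bands")
    base, extra = divmod(len(spectrum), n_groups)
    groups, start = [], 0
    for g in range(n_groups):
        # The first `extra` groups take one additional band so sizes differ by at most 1.
        size = base + (1 if g < extra else 0)
        groups.append(list(spectrum[start:start + size]))
        start += size
    return groups


def group_interleaved(spectrum: Sequence[float], n_groups: int) -> List[List[float]]:
    """Hypothetical strategy: band i goes to group i % n_groups (each group spans the spectrum)."""
    if not 1 <= n_groups <= len(spectrum):
        raise ValueError("n_groups must be between 1 and the number of bands")
    return [list(spectrum[g::n_groups]) for g in range(n_groups)]


# Band indices stand in for reflectance values of a single pixel.
pixel = list(range(10))
local_view = group_contiguous(pixel, 3)
global_view = group_interleaved(pixel, 3)
```

Each group becomes one time step of the input sequence, so the choice of grouping determines whether the recurrent layer sees narrow wavelength windows or coarse samples of the full spectrum at each step.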