Abstract
Satellite scene classification is challenging because of the high variability inherent in satellite data. Although remote sensing techniques have progressed rapidly in recent years, the resolution of available satellite images remains limited compared with general images acquired by a common camera. On the other hand, a satellite image usually has more spectral bands than a general image, which permits multi-spectral analysis of different land materials and supports low-resolution satellite scene recognition. This study advocates multi-spectral analysis and explores middle-level statistics of spectral information for satellite scene representation instead of spatial analysis; such middle-level representations are widely used in general image and natural scene classification, where they have achieved promising recognition performance. The proposed multi-spectral analysis first learns multi-spectral prototypes (a codebook) for representing pixel-wise spectral data; based on the learned codebook, a sparse-coded spectral vector is then obtained for each pixel with machine learning techniques. Furthermore, to combine the set of coded spectral vectors in a satellite scene image, we propose a hybrid aggregation (pooling) approach that, unlike conventional average and max pooling, retains the benefits of both existing methods while avoiding extremely noisy coded values. Experiments on three satellite datasets validate that the proposed approach compares favorably with state-of-the-art methods for satellite scene classification.
Highlights
The rapid progress in remote sensing imaging techniques over the past decade has produced an explosive amount of remote sensing (RS) satellite images with different spatial resolutions and spectral coverage
In order to reduce the reconstruction error caused by approximating a local feature with only one "visual word" in K-means, several variant coding methods, such as sparse coding (Sc), locality-constrained linear coding (LLC) [8,9,10,11], and the Gaussian mixture model (GMM) [12,13,14,15], have been explored in the BOW model to improve the reconstruction accuracy of local features, and some researchers have further endeavored to integrate the spatial relationships of the local features
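The contrast above between K-means hard assignment and sparse coding can be sketched numerically. The following is a minimal illustration, not the paper's implementation: the codebook `D` is random rather than learned, and the sparse codes are computed with a simple ISTA (proximal gradient) solver for the lasso objective, one standard way to realize sparse coding.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy over-complete codebook: 8 atoms for 4-dimensional features.
# (Illustrative stand-in; in the paper the codebook is learned from data.)
D = rng.normal(size=(8, 4))
D /= np.linalg.norm(D, axis=1, keepdims=True)  # unit-norm atoms

x = rng.normal(size=4)  # one local feature (e.g. a pixel spectrum)

# --- K-means-style hard assignment: one "visual word" per feature ---
word = np.argmax(D @ x)           # best-matching atom by correlation
x_hard = D[word] * (D[word] @ x)  # reconstruction from that single atom

# --- Sparse coding via ISTA (proximal gradient for the lasso) ---
def ista(x, D, lam=0.1, n_iter=200):
    """Minimize 0.5*||x - D.T @ a||^2 + lam*||a||_1 over codes a."""
    a = np.zeros(D.shape[0])
    L = np.linalg.norm(D @ D.T, 2)  # Lipschitz constant of the gradient
    for _ in range(n_iter):
        grad = D @ (D.T @ a - x)    # gradient of the quadratic term
        z = a - grad / L
        a = np.sign(z) * np.maximum(np.abs(z) - lam / L, 0.0)  # soft threshold
    return a

a = ista(x, D)
x_sparse = D.T @ a  # reconstruction from several weighted atoms

err_hard = np.linalg.norm(x - x_hard)
err_sparse = np.linalg.norm(x - x_sparse)
print(err_hard, err_sparse)  # compare the two reconstruction errors
```

Because the sparse code may combine several atoms instead of exactly one, its reconstruction error is typically much lower, which is precisely the motivation the highlight gives for moving beyond plain K-means quantization.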
The main contributions of our work are two-fold: (1) unlike the spectral analysis in the unmixing model, which usually obtains only a sub-complete basis via simplex identification approaches, we investigate an over-complete dictionary for more accurate reconstruction of any pixel spectrum and obtain the reconstruction coefficients with a sparse coding technique; (2) we generate the final representation of a satellite image from all coded sparse spectral vectors, for which we propose a generalized aggregation (pooling) strategy
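The aggregation step in contribution (2) can be illustrated with a small sketch. The exact hybrid pooling of the paper is not specified in this summary; the `topk_pool` below is one plausible instantiation (averaging the k largest responses per dimension, with k an assumed parameter) that sits between average and max pooling and damps single noisy coded values, which is the behavior the abstract describes.

```python
import numpy as np

def average_pool(codes):
    """Mean of each code dimension over all pixels (smooth but diluted)."""
    return codes.mean(axis=0)

def max_pool(codes):
    """Per-dimension maximum (discriminative but outlier-sensitive)."""
    return codes.max(axis=0)

def topk_pool(codes, k=5):
    """Hybrid sketch: average the k largest responses per dimension.

    Like max pooling it emphasizes strong responses; like average
    pooling it smooths over several values, so a single extremely
    noisy coded value cannot dominate the scene descriptor.
    """
    top = np.sort(codes, axis=0)[-k:]  # k largest values per column
    return top.mean(axis=0)

rng = np.random.default_rng(1)
codes = rng.random((100, 16))  # 100 pixels, 16-dim coded spectral vectors
codes[0, 3] = 50.0             # inject one extremely noisy coded value

print(max_pool(codes)[3])      # the outlier dominates max pooling
print(topk_pool(codes)[3])     # the outlier is averaged down
```

Plain average pooling would dilute the outlier but also every informative strong response, while max pooling passes the outlier through unchanged; the top-k compromise keeps strong responses visible without letting one of them define the feature.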
Summary
The rapid progress in remote sensing imaging techniques over the past decade has produced an explosive amount of remote sensing (RS) satellite images with different spatial resolutions and spectral coverage, allowing us to study the ground surface of the Earth in greater detail. Deep learning frameworks have achieved significant success in general object and natural scene understanding [23,24,25] and have been applied to remote sensing image classification [26,27,28,29,30]. These frameworks perform impressively compared with the traditional BOW model. Zhong et al. [32] proposed an agile convolutional neural network (CNN) architecture, named SatCNN, for high-spatial-resolution RS image scene classification, which used smaller kernel sizes to build an effective CNN architecture and demonstrated promising performance