Abstract

High-resolution remote sensing image scene classification problem has attracted lots of attentions due to its crucial role in a wide range of applications. Recently, many convolutional neural network (CNN) based methods have significantly boosted the performance of image scene classification. However, most of these algorithms only make use of global features learned in the fully connected layers of a CNN, neglecting local features learned in the convolutional layers that is of crucial importance for some remote sensing scenes. Therefore, a sparse representation framework is proposed to make use of both local and global features learned in a CNN for classification of remote sensing scenes by balancing sparse representation based classifiers using these two kinds of features. Specially, in order to reduce redundancy of local features learned in a convolutional layer, the most effective local feature is generated from each convolutional layer using global average pooling and these selected local features of different convolutional layers are cascaded to form local feature representation of the scene. Finally, experimental results on UC-Merced and WHU-RS19 datasets demonstrate that fusing global and local features in a CNN using the proposed sparse representation framework can certainly improve the performance of classification using single kind of features. Moreover, the proposed global average pooling strategy is very effective to fuse local features select most representative features from convolutional layers of a CNN.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call