Abstract

Remote sensing image scene classification is challenging because of the complicated spatial arrangement and varied object sizes inside a large-scale aerial image. To overcome the bottlenecks that current deep learning methods face in depicting and discriminating the complexity of remote sensing scenes, it is necessary to strengthen both local semantic representation and multi-scale feature representation. In this paper, we propose multi-scale stacking attention pooling (MS2AP) to tackle these challenges. Our work makes three main contributions. First, MS2AP can be conveniently embedded into current CNN models in an end-to-end manner to enhance the feature representation capability for remote sensing scenes. Second, we propose a novel residual channel-spatial attention module to mine the key local semantics in the feature maps. Compared with current attention modules, it can fuse top-down discriminative features and bottom-up convolution features from both the channel and spatial domains. Third, we propose a multi-scale dilated convolutional operator that extracts multi-scale feature maps while keeping their sizes the same. In MS2AP, these multi-scale feature maps are first stacked and then down-sampled by a weighted pooling whose weight matrix comes from our attention module. Extensive experiments demonstrate that MS2AP outperforms the baseline by 4.24% on UCM, 7.22% on AID, and 14.12% on the NWPU benchmark, and outperforms current state-of-the-art methods by a large margin.
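
The stacking-then-weighted-pooling idea can be illustrated with a minimal 1-D sketch in plain Python. This is not the authors' implementation: the function names (`dilated_conv1d_same`, `multi_scale_stack`, `weighted_pool1d`, `ms_attention_pool`) are hypothetical, the dilation rates and window size are arbitrary, and the attention weights are simply passed in rather than produced by the residual channel-spatial attention module. The sketch only shows why choosing the padding as dilation * (k - 1) / 2 keeps every dilated output the same size, and how an externally supplied weight vector can drive the down-sampling.

```python
def dilated_conv1d_same(signal, kernel, dilation):
    """1-D dilated cross-correlation with zero padding chosen so that the
    output length equals the input length (odd kernel sizes only)."""
    k = len(kernel)
    assert k % 2 == 1, "an odd kernel size keeps the padding symmetric"
    pad = dilation * (k - 1) // 2  # 'same' padding for stride 1
    padded = [0.0] * pad + list(signal) + [0.0] * pad
    return [
        sum(kernel[j] * padded[i + j * dilation] for j in range(k))
        for i in range(len(signal))
    ]


def multi_scale_stack(signal, kernel, dilations=(1, 2, 3)):
    """Apply the same kernel at several dilation rates and stack the results.
    Every map in the stack has the same length as the input."""
    return [dilated_conv1d_same(signal, kernel, d) for d in dilations]


def weighted_pool1d(values, weights, window):
    """Down-sample over non-overlapping windows, averaging the values with
    per-position weights (standing in for an attention map here)."""
    pooled = []
    for start in range(0, len(values) - window + 1, window):
        v = values[start:start + window]
        w = weights[start:start + window]
        total = sum(w)
        if total == 0:
            # Assumption: fall back to plain average pooling for an
            # all-zero weight window to avoid division by zero.
            pooled.append(sum(v) / len(v))
        else:
            pooled.append(sum(vi * wi for vi, wi in zip(v, w)) / total)
    return pooled


def ms_attention_pool(signal, kernel, attention, dilations=(1, 2, 3), window=2):
    """Stack multi-scale maps, then pool each one with the shared weights."""
    stack = multi_scale_stack(signal, kernel, dilations)
    return [weighted_pool1d(m, attention, window) for m in stack]


if __name__ == "__main__":
    signal = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
    attention = [1.0, 0.0, 0.5, 0.5, 0.0, 1.0, 1.0, 1.0]  # made-up weights
    for pooled in ms_attention_pool(signal, [1.0, 1.0, 1.0], attention):
        print(pooled)
```

In a real CNN the same arithmetic applies per spatial axis: with stride 1 and an odd kernel size k, a padding of dilation * (k - 1) / 2 leaves the feature-map size unchanged, which is what allows maps from different dilation rates to be stacked directly.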
