Abstract

Remote sensing image scene classification is an important means for the understanding of remote sensing images. Convolutional neural networks (CNNs) have been successfully applied to remote sensing image scene classification and have demonstrated remarkable performance. However, with improvements in image resolution, remote sensing image categories are becoming increasingly diverse, and problems such as high intraclass diversity and high interclass similarity have arisen. The performance of ordinary CNNs at distinguishing increasingly complex remote sensing images is still limited. Therefore, we propose a feature fusion framework based on hierarchical attention and bilinear pooling called HABFNet for the scene classification of remote sensing images. First, the deep CNN ResNet50 is used to extract the deep features from different layers of the image, and these features are fused to boost their robustness and effectiveness. Second, we design an improved channel attention scheme to enhance the features from different layers. Finally, the enhanced features are cross-layer bilinearly pooled and fused, and the fused features are used for classification. Extensive experiments were conducted on three publicly available remote sensing image benchmarks. Comparisons with the state-of-the-art methods demonstrated that the proposed HABFNet achieved competitive classification performance.

Highlights

  • T HE tremendous progress in Earth observation technology has provided a steady stream of remote sensing image data for observing and understanding changes on the Earth’s surface [1]

  • Inspired by the fine-grained visual categorization and the attention mechanism in Convolutional neural networks (CNNs), for the scene classification task of a remote sensing image, we propose a feature fusion algorithm called hierarchical attention and bilinear fusion net (HABFNet), based on channel attention and hierarchical bilinear pooling (HBP) [24], which has shown excellent performance in fine-grained visual categorization

  • Inspired by the attention mechanism, we introduced an improved channel attention approach in our remote sensing image scene classification model to enhance the features extracted by the CNN

Read more

Summary

Introduction

T HE tremendous progress in Earth observation technology has provided a steady stream of remote sensing image data for observing and understanding changes on the Earth’s surface [1]. Full utilization of massive remote sensing data to promote effective analysis and understanding has become a popular and critical issue that must be urgently investigated. Its main task is to assign predefined category information to crops in large-scale remote sensing images [1], such as airports, ports, farmland, or residential areas. The category information is usually determined according to the function of the ground area, so that the same type of remote sensing image scene usually contains multiple. Manuscript received July 23, 2020; revised September 7, 2020; accepted October 2, 2020. Date of publication October 13, 2020; date of current version November 2, 2020.

Methods
Results
Discussion
Conclusion

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.