Abstract

A deep neural network is suitable for pixel-wise classification of remote sensing images because it effectively extracts features from the raw data. However, remote sensing images with higher spatial resolution exhibit smaller inter-class differences and greater intra-class differences; thus, feature extraction becomes more difficult. The attention mechanism, which simulates the manner in which humans comprehend and perceive images, is useful for the quick and accurate acquisition of key features. In this study, we propose a novel neural network that incorporates two kinds of attention mechanisms in its mask and trunk branches, namely control gate (soft) and feedback attention mechanisms, respectively, based on the branches’ primary roles. Thus, a deep neural network can be equipped with an attention mechanism to perform pixel-wise classification for very high-resolution remote sensing (VHRRS) images. The control gate attention mechanism in the mask branch builds pixel-wise masks for feature maps, assigning different priorities to different locations on different channels to recalibrate feature extraction, emphasize effective features, and weaken the influence of uninformative ones. The feedback attention mechanism in the trunk branch allows for the retrieval of high-level semantic features. Hence, additional aids are provided for lower layers to re-weight the focus and to update higher-level feature extraction in a target-oriented manner. These two attention mechanisms are fused to form a neural network module. By stacking various modules with different-scale mask branches, the network utilizes different attention-aware features under different local spatial structures. The proposed method is tested on VHRRS images from the BJ-02, GF-02, Geoeye, and Quickbird satellites, and the influence of the network structure and the rationality of the network design are discussed. Compared with other state-of-the-art methods, our proposed method achieves competitive accuracy, demonstrating its effectiveness.

Highlights

  • Image classification for very high-resolution remote sensing images (VHRRSI) is an important aspect of efficient and effective earth observation information extraction

  • For improved feature extraction and higher-accuracy VHRRSI pixel-wise classification, we propose a novel neural network with attention mechanisms that exploits multi-scale spatial and spectral information

  • The soft attention mechanism and feedback attention mechanism form the control gate mask and trunk branches, respectively, and the point-wise multiplication of these two branches enables the fusion of these two attention mechanisms
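
The point-wise fusion of the two branches mentioned in the last highlight can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`soft_mask`, `fuse`), the use of a sigmoid to squash mask-branch outputs into (0, 1), and the toy values are all assumptions made for illustration. Feature maps are plain nested lists indexed as [channel][row][column].

```python
import math


def sigmoid(x):
    """Logistic function; assumed here as the gate nonlinearity."""
    return 1.0 / (1.0 + math.exp(-x))


def soft_mask(mask_logits):
    """Turn raw mask-branch outputs into per-channel, per-pixel weights in (0, 1)."""
    return [[[sigmoid(v) for v in row] for row in channel] for channel in mask_logits]


def fuse(trunk_features, mask_logits):
    """Point-wise multiplication of trunk-branch features by the soft mask."""
    mask = soft_mask(mask_logits)
    return [
        [[t * m for t, m in zip(t_row, m_row)] for t_row, m_row in zip(t_ch, m_ch)]
        for t_ch, m_ch in zip(trunk_features, mask)
    ]


# Toy input: one channel, 2 x 2 pixels (hypothetical values).
trunk = [[[1.0, 2.0], [3.0, 4.0]]]
logits = [[[0.0, 10.0], [-10.0, 0.0]]]

fused = fuse(trunk, logits)
# A large positive logit keeps the trunk feature almost unchanged,
# a large negative logit suppresses it, and a zero logit halves it.
```

Because the mask carries one weight per location and per channel, the same multiplication can emphasize a feature in one channel while suppressing it in another at the same pixel.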


Summary

Introduction

Image classification for very high-resolution remote sensing images (VHRRSI) is an important aspect of efficient and effective earth observation information extraction. Assigning a label to each pixel of a VHRRSI, which is called per-pixel or pixel-wise classification, is of great importance and is considered the basis for land mapping, image understanding, contour detection, object extraction, and so on [1,2,3,4]. Deep networks have been successfully employed for feature extraction from remote sensing images in many studies [9,10,11,12,13,14,15], outperforming conventional methods. Remote sensing images acquired at finer resolution yield improved ground-object perception [16]. However, inter-class and intra-class variation make land object classification difficult [17].

