Abstract

Deep learning, especially the convolutional neural network (CNN), has been widely applied to remote sensing scene classification in recent years. However, such approaches tend to extract features from the whole image rather than from discriminative regions. This article proposes a self-attention network model with a joint loss that effectively reduces the interference of complex backgrounds and the impact of intra-class diversity in remote sensing image scene classification. In this model, a self-attention mechanism is integrated into ResNet18 to extract discriminative image features, which makes the CNN focus on the most salient region of each image and suppresses interference from irrelevant information. Moreover, to reduce the influence of intra-class diversity on scene classification, a joint loss function combining center loss with cross-entropy loss is proposed to further improve the accuracy of complex scene classification. Experiments carried out on the AID, NWPU-RESISC45, and UC Merced datasets show that the overall accuracy of the proposed model is higher than that of most current competitive remote sensing image scene classification methods. It also performs well when data samples are scarce or backgrounds are complex.
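
As a rough illustration of the architecture described above, the sketch below attaches a non-local-style self-attention block to the last convolutional stage of a torchvision ResNet18 before global pooling. The module names (SelfAttention2d, AttentiveResNet18), the channel-reduction factor, and the exact integration point are assumptions for illustration; the paper's precise design may differ.

```python
# Minimal sketch (not the authors' exact model): self-attention over the
# final ResNet18 feature map, so salient regions are emphasized before pooling.
import torch
import torch.nn as nn
from torchvision import models


class SelfAttention2d(nn.Module):
    """Non-local-style self-attention over the spatial positions of a feature map."""

    def __init__(self, in_channels, reduction=8):
        super().__init__()
        inner = in_channels // reduction
        self.query = nn.Conv2d(in_channels, inner, kernel_size=1)
        self.key = nn.Conv2d(in_channels, inner, kernel_size=1)
        self.value = nn.Conv2d(in_channels, in_channels, kernel_size=1)
        self.gamma = nn.Parameter(torch.zeros(1))  # learnable weight on the attention branch

    def forward(self, x):
        b, c, h, w = x.shape
        q = self.query(x).flatten(2).transpose(1, 2)   # (b, hw, c')
        k = self.key(x).flatten(2)                     # (b, c', hw)
        attn = torch.softmax(q @ k, dim=-1)            # (b, hw, hw) attention map
        v = self.value(x).flatten(2).transpose(1, 2)   # (b, hw, c)
        out = (attn @ v).transpose(1, 2).reshape(b, c, h, w)
        return self.gamma * out + x                    # residual connection


class AttentiveResNet18(nn.Module):
    """ResNet18 backbone with a self-attention block before global pooling."""

    def __init__(self, num_classes):
        super().__init__()
        backbone = models.resnet18(weights=None)
        self.features = nn.Sequential(*list(backbone.children())[:-2])  # drop avgpool and fc
        self.attention = SelfAttention2d(512)
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.fc = nn.Linear(512, num_classes)

    def forward(self, x):
        f = self.attention(self.features(x))   # attend over salient regions
        emb = self.pool(f).flatten(1)          # 512-d embedding (usable by a center loss)
        return self.fc(emb), emb               # class logits and embedding
```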

Highlights

  • With the rapid development of remote sensing technology and sensor systems in recent years, remote sensing image data are emerging continuously

  • In order to improve the discriminative ability of the model and reduce the influence of intra-class diversity on classification, we propose a joint loss function that combines the center loss with the cross-entropy loss (a minimal sketch follows this list)
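
Below is a minimal sketch, assuming a PyTorch-style implementation, of the joint loss named in the highlights: cross-entropy plus a weighted center loss on the embeddings. The CenterLoss class, the embedding dimension, and the weight lambda_c are illustrative choices, not the authors' exact settings.

```python
# Sketch of a joint loss: L = L_cross_entropy + lambda_c * L_center (weight is illustrative).
import torch
import torch.nn as nn


class CenterLoss(nn.Module):
    """Penalizes the distance between each embedding and its learnable class center."""

    def __init__(self, num_classes, feat_dim):
        super().__init__()
        self.centers = nn.Parameter(torch.randn(num_classes, feat_dim))

    def forward(self, embeddings, labels):
        centers = self.centers[labels]                        # (batch, feat_dim)
        return ((embeddings - centers) ** 2).sum(dim=1).mean() / 2


def joint_loss(logits, embeddings, labels, center_loss_fn, lambda_c=0.01):
    """Cross-entropy on logits plus weighted center loss on embeddings."""
    ce = nn.functional.cross_entropy(logits, labels)
    return ce + lambda_c * center_loss_fn(embeddings, labels)
```

In training, the logits and embeddings produced by the network (for example, the pair returned by the sketch after the abstract) would both feed this loss, with the class centers updated by the optimizer alongside the network weights.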


Summary

Introduction

With the rapid development of remote sensing technology and sensor systems, remote sensing image data are emerging continuously. Most traditional classification methods are based on low-level or mid-level features [5,6], but such features struggle to describe the semantic information of remote sensing images effectively, which makes the classification results unsatisfactory. Researchers have therefore utilized convolutional neural networks to extract high-level semantic features for remote sensing image scene classification, most often by employing pretrained models such as CaffeNet [14], GoogLeNet [15], and VGGNet [16] as feature extractors (a sketch of this approach follows below).

A. FEATURE EXTRACTION

According to the semantic level of the extracted features, previous remote sensing image scene classification methods are mainly divided into three types: those based on low-level, mid-level, and high-level features.
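
As a hedged illustration of the "pretrained CNN as feature extractor" strategy mentioned above, the sketch below pulls penultimate-layer features from an ImageNet-pretrained VGG16 via torchvision. The cited works used CaffeNet, GoogLeNet, and VGGNet, and their exact preprocessing and downstream classifiers vary; the details here are assumptions.

```python
# Sketch: extract high-level features from a scene image with a pretrained VGG16,
# which would then typically be fed to a classifier such as an SVM or softmax layer.
import torch
from torchvision import models, transforms
from PIL import Image

preprocess = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

vgg = models.vgg16(weights=models.VGG16_Weights.IMAGENET1K_V1).eval()

def extract_features(image_path):
    """Return a 4096-d high-level feature vector for one remote sensing scene image."""
    x = preprocess(Image.open(image_path).convert("RGB")).unsqueeze(0)
    with torch.no_grad():
        f = vgg.features(x)              # convolutional feature maps
        f = vgg.avgpool(f).flatten(1)    # (1, 25088)
        f = vgg.classifier[:-1](f)       # output of the penultimate fully connected layer
    return f.squeeze(0)
```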
