Abstract

Deep learning technology (deeper and better-optimized network structures) and remote sensing imaging (i.e., increasingly multisource and multicategory remote sensing data) have developed rapidly. Although the deep convolutional neural network (CNN) has achieved state-of-the-art performance on remote sensing image (RSI) scene classification, the existence of adversarial attacks poses a potential security threat to CNN-based RSI scene classification. Adversarial samples can be generated by adding a small perturbation to the original images; feeding them to a CNN-based classifier causes it to misclassify with high confidence. To achieve a higher attack success rate against CNN-based scene classification, we introduce the projected gradient descent method to generate adversarial remote sensing images. We then select several mainstream CNN-based classifiers as the attacked models to demonstrate the effectiveness of our method. The experimental results show that our proposed method can dramatically reduce the classification accuracy under both untargeted and targeted attacks. Furthermore, we evaluate the quality of the generated adversarial images through visual and quantitative comparisons. The results show that our method generates imperceptible adversarial samples and has a stronger attack ability against RSI scene classification.
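The projected gradient descent (PGD) attack mentioned above iterates a gradient-sign step on the classification loss and projects the result back into a small L-infinity ball around the original image. The following is a minimal NumPy sketch of that update, not the authors' implementation: it uses a toy linear softmax "classifier" so the gradient can be written in closed form, and all names and parameters (`pgd_attack`, `eps`, `alpha`, `steps`) are illustrative.

```python
import numpy as np

def softmax(z):
    z = z - z.max()          # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum()

def pgd_attack(W, x0, y, eps=0.03, alpha=0.01, steps=10):
    """Untargeted PGD sketch: ascend the cross-entropy loss of the true
    label y, projecting back into the L-infinity ball of radius eps
    around the clean input x0 after every step."""
    x = x0.copy()
    for _ in range(steps):
        p = softmax(W @ x)             # class probabilities of a linear model
        grad_logits = p.copy()
        grad_logits[y] -= 1.0          # d(cross-entropy)/d(logits) = p - onehot(y)
        grad_x = W.T @ grad_logits     # chain rule back to the input pixels
        x = x + alpha * np.sign(grad_x)     # gradient-sign ascent step
        x = np.clip(x, x0 - eps, x0 + eps)  # project onto the eps-ball
        x = np.clip(x, 0.0, 1.0)            # keep a valid pixel range
    return x

# Toy demo: a 3-class linear classifier on a 4-pixel "image".
rng = np.random.default_rng(0)
W = rng.normal(size=(3, 4))
x0 = rng.uniform(size=4)
y = int(np.argmax(W @ x0))             # the clean prediction
x_adv = pgd_attack(W, x0, y)
print(np.max(np.abs(x_adv - x0)) <= 0.03 + 1e-9)  # True: perturbation bounded by eps
```

For a real CNN the gradient `grad_x` would come from automatic differentiation rather than the closed-form expression above; the projection and sign-step structure are the same. The eps-ball projection is what keeps the adversarial sample visually imperceptible.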

Highlights

  • With the rapid development of remote sensing technology, we can obtain increasingly multisource and multicategory remote sensing images (RSIs). Those images are efficiently employed to observe ground objects

  • Therefore, employing RSIs for scene classification has gathered considerable attention. The purpose of scene classification is to predict the label of a given RSI, e.g., airport, forest, or river. The critical step in RSI scene classification is extracting image features. Existing research can mainly be divided into two strategies according to the features used: (1) low-level features and middle-level global features obtained by manual extraction and (2) high-level features extracted automatically by the convolutional neural network (CNN)

  • The OA gap is the difference between the classification accuracy on the clean test set and that on the adversarial test sets generated by the Xu2020 method and by our method


Introduction

With the rapid development of remote sensing technology, we can obtain increasingly multisource and multicategory remote sensing images (RSIs). Those images are efficiently employed to observe ground objects. The critical step in RSI scene classification is extracting image features. Existing research can mainly be divided into two strategies according to the features used: (1) low-level features and middle-level global features obtained by manual extraction and (2) high-level features extracted automatically by the convolutional neural network (CNN). The extraction of middle-level features can be regarded as computing high-order statistics of the low-level local features [7,8,9]. Compared with low-level features, middle-level features (e.g., the bag-of-visual-words model [10]) are more expressive for scene images. However, due to their lack of flexibility and adaptability, middle-level features cannot distinguish complex scenes or reach the ideal classification accuracy. All of the above features are manually extracted, so end-to-end scene classification cannot be achieved, and the final classification accuracy is largely subject to human experience

