Abstract

Convolutional neural networks have outperformed humans in image recognition tasks, but they remain vulnerable to attacks from adversarial examples. Since these data are crafted by adding imperceptible noise to normal images, their existence poses potential security threats to deep learning systems. Sophisticated adversarial examples with strong attack performance can also be used as a tool to evaluate the robustness of a model. However, the success rate of adversarial attacks can be further improved in black-box environments. Therefore, this study combines a modified Adam gradient descent algorithm with the iterative gradient-based attack method. The proposed Adam iterative fast gradient method is then used to improve the transferability of adversarial examples. Extensive experiments on ImageNet showed that the proposed method offers a higher attack success rate than existing iterative methods. By extending our method, we achieved a state-of-the-art attack success rate of 95.0% on defense models.

Highlights

  • In image recognition tasks, convolutional neural networks are able to classify images with an accuracy approaching that of humans [1,2,3,4]

  • Szegedy et al. [5] first proposed the concept of adversarial examples: images with small added perturbations that cause neural network models to output incorrect classifications with high confidence. These adversarial perturbations are often imperceptible to the human eye

  • Adam Iterative Fast Gradient Method. The generation of adversarial examples is similar to the training of neural networks; both processes can be viewed as an optimization problem
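The optimization view in the last highlight can be illustrated with a toy sketch. This page does not state the update rule of the proposed method, so the code below is not the authors' algorithm. It is a generic Adam-style projected gradient ascent on the loss of a hypothetical logistic model. The function names (`loss_and_grad`, `adam_style_attack`), the parameters (`eps`, `steps`, `beta1`, `beta2`, `delta`), the step size `eps / steps`, and the toy weights are all assumptions made for illustration.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def loss_and_grad(x, y, w, b):
    """Cross-entropy loss of a toy logistic model and its gradient w.r.t. the input x."""
    p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
    tiny = 1e-12  # avoids log(0)
    loss = -(y * math.log(p + tiny) + (1 - y) * math.log(1 - p + tiny))
    grad = [(p - y) * wi for wi in w]
    return loss, grad


def clip(value, low, high):
    return max(low, min(high, value))


def adam_style_attack(x, y, w, b, eps=0.1, steps=10,
                      beta1=0.9, beta2=0.999, delta=1e-8):
    """Illustrative only: ascend the loss with Adam moment estimates,
    keeping the perturbation inside an L-infinity ball of radius eps."""
    alpha = eps / steps          # assumed step size
    x_adv = list(x)
    m = [0.0] * len(x)           # first-moment (mean) estimate of the gradient
    v = [0.0] * len(x)           # second-moment (uncentered variance) estimate
    for t in range(1, steps + 1):
        _, g = loss_and_grad(x_adv, y, w, b)
        for i, gi in enumerate(g):
            m[i] = beta1 * m[i] + (1 - beta1) * gi
            v[i] = beta2 * v[i] + (1 - beta2) * gi * gi
            m_hat = m[i] / (1 - beta1 ** t)   # bias correction
            v_hat = v[i] / (1 - beta2 ** t)
            step = alpha * m_hat / (math.sqrt(v_hat) + delta)
            # Gradient ascent on the loss, then projection onto the eps-ball
            # around x and onto the valid pixel range [0, 1].
            xi = clip(x_adv[i] + step, x[i] - eps, x[i] + eps)
            x_adv[i] = clip(xi, 0.0, 1.0)
    return x_adv


if __name__ == "__main__":
    w, b = [1.5, -2.0, 0.5, 1.0], 0.1   # hypothetical model parameters
    x, y = [0.6, 0.3, 0.5, 0.7], 1.0    # hypothetical input and label
    x_adv = adam_style_attack(x, y, w, b)
    print("loss before:", loss_and_grad(x, y, w, b)[0])
    print("loss after: ", loss_and_grad(x_adv, y, w, b)[0])
```

The loop has the same shape as a training loop, except that the gradient is taken with respect to the input rather than the weights, and the update increases the loss instead of decreasing it.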


Summary

Introduction

Convolutional neural networks can classify images with an accuracy approaching that of humans [1,2,3,4]. Szegedy et al. [5] first proposed the concept of adversarial examples: images with small added perturbations that cause neural network models to output incorrect classifications with high confidence. These adversarial perturbations are often imperceptible to the human eye (in other words, there is no obvious visual difference between the adversarial examples and the original images). A variety of techniques can be used to generate adversarial examples and perform white-box attacks, depending on the model structure and corresponding parameters [6,7,8,9,10]. Goodfellow et al. [6] suggested that different models learn similar decision boundaries on the same image classification task and obtain similar parameters. These properties make it easier to generalize adversarial examples to different models. This facilitates black-box attacks, in which the structure and parameters of the model are not available, on various neural networks [11].
