Abstract

Neural networks perform excellently on recognition tasks such as image recognition and speech recognition, as well as on pattern analysis and other tasks in fields related to artificial intelligence. However, neural networks are vulnerable to adversarial examples. An adversarial example is a sample created by applying a minimal perturbation to a legitimate sample so that it is misclassified by a target model while remaining unproblematic for human recognition. Because the perturbation applied to the legitimate sample is optimized, the classification score for the target class ends up similar to that for the legitimate class: the perturbation is applied only until the target class's score is slightly higher than the legitimate class's. Given this regularity in the classification scores, an optimized adversarial example is easy to detect by looking for the pattern. However, existing methods for generating optimized adversarial examples do not account for this weakness of being detectable through the classification-score pattern. To address this weakness, we propose an optimized adversarial example generation method that removes it: a minimal perturbation is applied to a legitimate sample so that the classification score for the legitimate class falls below that of some of the other classes, yielding an optimized adversarial example with the pattern vulnerability removed. The results show that, using 500 iterations, the proposed method can generate optimized adversarial examples with a 100% attack success rate and distortions of 2.81 and 2.23 on MNIST and Fashion-MNIST, respectively.
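To make the detection argument concrete, here is a minimal sketch (not from the paper) of how the score pattern could be flagged: a sample whose top two softmax scores are nearly tied is suspicious. The threshold `tau` and the toy logits are illustrative assumptions, not values from the paper.

```python
import numpy as np

def softmax(logits):
    """Numerically stable softmax over a 1-D array of logits."""
    z = logits - np.max(logits)
    e = np.exp(z)
    return e / e.sum()

def looks_like_optimized_adversarial(logits, tau=0.05):
    """Flag a sample whose top two classification scores are nearly tied,
    the pattern left behind when perturbation stops as soon as the target
    class barely overtakes the legitimate class.
    `tau` is an illustrative threshold, not a value from the paper."""
    scores = np.sort(softmax(logits))[::-1]
    return (scores[0] - scores[1]) < tau

# A near-tie between the top two classes is flagged; a clear winner is not.
print(looks_like_optimized_adversarial(np.array([2.0, 1.98, -1.0, 0.3])))  # True
print(looks_like_optimized_adversarial(np.array([5.0, 1.0, -1.0, 0.3])))   # False
```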

Highlights

  • Neural networks [1] exhibit excellent performance on artificial intelligence tasks such as image recognition [2], speech recognition [3], and pattern analysis [4]

  • Adversarial examples are samples created by applying a small perturbation to a legitimate sample such that humans can still recognize the sample correctly but the target model misclassifies it

  • We report the results of our experiment using MNIST [14] and Fashion-MNIST [15] to evaluate the method’s performance and to verify that the proposed scheme generates an optimized adversarial example from which the classification score pattern vulnerability has been removed

Summary

INTRODUCTION

Neural networks [1] exhibit excellent performance on artificial intelligence tasks such as image recognition [2], speech recognition [3], and pattern analysis [4]. However, they are vulnerable to adversarial examples. The basic method for generating adversarial examples is to apply the minimum adversarial perturbation to a legitimate sample that will cause the target model to misclassify it. We propose a method for generating an optimized adversarial example that removes the pattern vulnerability in the classification scores. It does this by applying an additional minimal distortion to the legitimate sample so that the classification scores of the generated adversarial example for the legitimate class and for the target class are no longer similar. We report the results of our experiments on MNIST [14] and Fashion-MNIST [15], which evaluate the method's performance and verify that the proposed scheme generates optimized adversarial examples from which the classification score pattern vulnerability has been removed.
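As a concrete illustration, the following is a minimal PyTorch sketch of the kind of iterative generation loop described above. The model interface, the step size, the signed-gradient update, and the choice of `k` (how many classes the legitimate class's score must fall below before stopping) are illustrative assumptions rather than the paper's exact formulation; only the 500-iteration budget echoes the abstract.

```python
import torch
import torch.nn.functional as F

def generate_adversarial(model, x, legit_class, target_class,
                         k=3, step=0.01, max_iters=500):
    """Iteratively perturb x toward target_class, continuing until the
    legitimate class's score falls below the scores of at least k other
    classes, so that no near-tie between the legitimate and target
    scores (the detectable pattern) remains.

    Illustrative sketch: k, step, and the signed-gradient update are
    assumptions, not the paper's exact optimization.
    """
    x_adv = x.clone().detach().requires_grad_(True)
    for _ in range(max_iters):
        probs = F.softmax(model(x_adv), dim=-1).squeeze(0)  # shape: (num_classes,)
        # Stop only when the attack succeeds AND the legitimate class has
        # been pushed below at least k other classes.
        if (probs.argmax().item() == target_class
                and (probs > probs[legit_class]).sum().item() >= k):
            break
        # Lower the legitimate score and raise the target score.
        loss = probs[legit_class] - probs[target_class]
        grad, = torch.autograd.grad(loss, x_adv)
        with torch.no_grad():
            x_adv = (x_adv - step * grad.sign()).clamp_(0.0, 1.0)  # keep valid pixels
        x_adv.requires_grad_(True)
    return x_adv.detach()
```

In this sketch, `x` would be a batched image tensor in [0, 1] (e.g., shape (1, 1, 28, 28) for MNIST) and `model` a trained classifier returning logits; both are assumed interfaces, not the paper's.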

BACKGROUND AND RELATED WORK
DISTORTION
OPTIMIZED ADVERSARIAL EXAMPLES
ASSUMPTION
METHOD
EXPERIMENT AND EVALUATION
DISCUSSION
  Findings
  Limitations
CONCLUSIONS