Abstract

Deep neural networks (DNNs) provide excellent performance in image recognition, speech recognition, video recognition, and pattern analysis. However, they are vulnerable to adversarial example attacks. An adversarial example is an input to which a small amount of noise has been strategically added; it appears normal to the human eye but is misrecognized by the DNN. In this paper, we propose AdvGuard, a method for resisting adversarial example attacks. This defense method prevents the generation of adversarial examples by constructing a robust DNN that provides random confidence values. The method does not require training on adversarial examples, additional processing modules, or input data filtering. In addition, a DNN constructed using the proposed scheme can defend against adversarial examples while maintaining its accuracy on the original samples. In the experimental evaluation, MNIST and CIFAR10 were used as datasets, and TensorFlow was used as the machine learning library. The results show that a DNN constructed using the proposed method can correctly classify adversarial examples with 100% and 99.5% accuracy on MNIST and CIFAR10, respectively.

Highlights

  • Deep neural networks (DNNs) [1] display good performance in recognition domains such as image recognition [2], speech recognition [3], intrusion detection [4], and pattern recognition [5]

  • The optimized adversarial example should satisfy the aim of being misrecognized by the model while having minimal distortion from the original sample; this is accomplished through feedback in the form of the confidence values delivered by the target model

  • Attack success rate refers to the proportion of adversarial examples for which the class recognized by the target classifier matches the class intended by the attacker
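
The attack success rate defined in the last highlight can be written as a short computation. The following is a minimal sketch, not the authors' evaluation code: the function name `attack_success_rate` and its two list arguments are assumptions made for illustration, and class labels are taken to be plain integers.

```python
def attack_success_rate(predicted_classes, intended_classes):
    """Fraction of adversarial examples classified as the attacker intended.

    predicted_classes: class the target classifier assigns to each adversarial example
    intended_classes:  class the attacker wanted for the same example
    (Both names are hypothetical; they do not come from the paper.)
    """
    if len(predicted_classes) != len(intended_classes):
        raise ValueError("both lists must describe the same adversarial examples")
    if not predicted_classes:
        raise ValueError("at least one adversarial example is required")
    # Count the examples where the classifier's output equals the attacker's target.
    hits = sum(p == t for p, t in zip(predicted_classes, intended_classes))
    return hits / len(predicted_classes)
```

Under this definition a lower value is better for the defender, since it means fewer adversarial examples reached the attacker's intended class.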

Summary

Introduction

Deep neural networks (DNNs) [1] perform well in recognition domains such as image recognition [2], speech recognition [3], intrusion detection [4], and pattern recognition [5]. An adversarial example is an input sample to which a small amount of noise has been added; the noise is imperceptible to humans but is designed to induce misrecognition by the DNN. To generate an optimized adversarial example, an attacker needs to know the confidence values calculated by the target model, which are obtained by accessing the softmax layer. The optimized adversarial example should be misrecognized by the model while having minimal distortion from the original sample; this is accomplished through feedback in the form of the confidence values delivered by the target model.
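
To make the role of the confidence values concrete, the sketch below shows one common way an attacker's two goals (misrecognition and minimal distortion) can be combined into a single score that depends on the softmax output. It is an illustrative assumption, not the attack formulation used in the paper: the names `softmax`, `l2_distortion`, and `attack_objective`, the L2 distance, and the weight `c` are all hypothetical choices.

```python
import math


def softmax(logits):
    """Convert raw class scores into confidence values that sum to 1."""
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def l2_distortion(original, candidate):
    """Euclidean distance between the original sample and the candidate."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(original, candidate)))


def attack_objective(original, candidate, confidences, target, c=1.0):
    """Hypothetical targeted-attack score: distortion plus a weighted penalty.

    The penalty is positive while some other class has a higher confidence
    than the attacker's target class, and zero once the target class wins.
    """
    best_other = max(p for i, p in enumerate(confidences) if i != target)
    penalty = max(best_other - confidences[target], 0.0)
    return l2_distortion(original, candidate) + c * penalty
```

An attacker minimizing a score of this kind has to query the model's confidence values repeatedly, which is why a defense that makes those values uninformative interferes with the optimization.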

