Abstract
Deep neural networks (DNNs) provide excellent performance in image recognition, speech recognition, video recognition, and pattern analysis. However, they are vulnerable to adversarial example attacks. An adversarial example is an input to which a small amount of noise has been strategically added; it appears normal to the human eye but is misrecognized by the DNN. In this paper, we propose AdvGuard, a method for resisting adversarial example attacks. This defense prevents the generation of adversarial examples by constructing a robust DNN that returns random confidence values. The method requires no training on adversarial examples, no additional processing modules, and no input data filtering. In addition, a DNN constructed using the proposed scheme defends against adversarial examples while maintaining its accuracy on the original samples. In the experimental evaluation, MNIST and CIFAR10 were used as datasets, and TensorFlow was used as the machine learning library. The results show that a DNN constructed using the proposed method correctly classifies adversarial examples with 100% and 99.5% accuracy on MNIST and CIFAR10, respectively.
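The core idea of the defense, returning random confidence values so that an attacker receives no usable optimization feedback while the predicted label stays correct, can be sketched as follows. This is an illustrative wrapper, not the paper's implementation; the function name and construction are hypothetical.

```python
import numpy as np

def randomized_confidences(logits, rng=np.random.default_rng()):
    """Return a confidence vector whose argmax matches the model's actual
    prediction, but whose values are random, so they carry no information
    an attacker could exploit as feedback (illustrative sketch only, not
    the paper's exact construction)."""
    k = len(logits)
    true_class = int(np.argmax(logits))  # the model's real prediction
    conf = rng.random(k)                 # random stand-in confidences
    # Force the true class to keep the highest (still random) value.
    conf[true_class] = conf.max() + 0.1 + rng.random()
    return conf / conf.sum(), true_class

# Hypothetical logits: class 1 is the model's real prediction.
probs, label = randomized_confidences(np.array([0.2, 2.5, -1.0]))
```

The returned vector still sums to one and preserves the top-1 class, so clean-sample accuracy is unaffected, but repeated queries yield unrelated confidence values.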
Highlights
Deep neural networks (DNNs) [1] display good performance in recognition domains such as image recognition [2], speech recognition [3], intrusion detection [4], and pattern recognition [5]
An optimized adversarial example must be misrecognized by the model while minimally distorting the original sample; this is achieved through feedback in the form of the confidence values returned by the target model
In the experimental results, attack success rate refers to the proportion of adversarial examples for which the class recognized by the target classifier matches the class intended by the attacker
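Under that definition, the metric is straightforward to compute. A minimal sketch (the class labels and predictions below are made up for illustration):

```python
# Attack success rate: fraction of adversarial examples classified by the
# target model as the attacker's intended (target) class.
def attack_success_rate(predicted_classes, target_classes):
    assert len(predicted_classes) == len(target_classes)
    hits = sum(p == t for p, t in zip(predicted_classes, target_classes))
    return hits / len(target_classes)

# Hypothetical results for 5 adversarial examples:
preds = [3, 7, 7, 1, 0]    # classes the target classifier actually output
targets = [3, 7, 2, 1, 9]  # classes the attacker intended
rate = attack_success_rate(preds, targets)  # 3 of 5 match -> 0.6
```

A successful defense drives this rate toward zero while leaving accuracy on clean samples unchanged.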
Summary
Deep neural networks (DNNs) [1] display good performance in recognition domains such as image recognition [2], speech recognition [3], intrusion detection [4], and pattern recognition [5]. An adversarial example is an input sample to which a small amount of noise has been added; the noise is imperceptible to humans but is designed to induce misrecognition by the DNN. To generate an optimized adversarial example, an attacker must know the confidence values computed by the target model, which requires access to its softmax layer. An optimized adversarial example must be misrecognized by the model while minimally distorting the original sample; this is achieved through feedback in the form of the confidence values returned by the target model.
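The confidence-feedback loop described above can be sketched with a gradient-based perturbation step in the spirit of FGSM. The toy linear-softmax "model", weights, and step size below are hypothetical stand-ins, not the paper's setup; they only illustrate how softmax confidences drive the attack.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

# Hypothetical linear "model": logits = W @ x (stands in for a DNN's softmax layer).
rng = np.random.default_rng(0)
W = rng.normal(size=(3, 4))
x = rng.normal(size=4)
true_label = int(np.argmax(softmax(W @ x)))
target_label = (true_label + 1) % 3  # attacker's chosen target class

# FGSM-style step: the attacker uses the model's confidence values (softmax
# output) to compute the gradient of the target-class cross-entropy loss
# w.r.t. the input, then nudges x in the direction that raises the target
# class's confidence.
eps = 0.1
probs = softmax(W @ x)
grad = W.T @ (probs - np.eye(3)[target_label])  # dL/dx for the linear model
x_adv = x - eps * np.sign(grad)                 # small, bounded perturbation

adv_probs = softmax(W @ x_adv)  # target-class confidence increases
```

Because each step needs fresh confidence values from the model, a defense that returns random confidences (as AdvGuard proposes) deprives the attacker of this feedback signal.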