Abstract

Deep neural networks (DNNs) provide superior performance on machine learning tasks such as image recognition, speech recognition, pattern analysis, and intrusion detection. However, an adversarial example, created by adding a small amount of noise to an original sample, can cause a DNN to misclassify. This is a serious threat because the added noise is not detectable by the human eye. For example, if an attacker modifies a right-turn sign so that it is misread as a left-turn sign, an autonomous vehicle using the DNN will incorrectly classify the modified sign as pointing to the left, whereas a person will correctly classify it as pointing to the right. Defenses against such adversarial examples are being studied, but existing methods require an additional process such as changing the classifier or modifying the input data. In this paper, we propose a new method for detecting adversarial examples that requires no such additional process. The proposed scheme detects adversarial examples by using a pattern feature of their classification scores. We used MNIST and CIFAR10 as experimental datasets and TensorFlow as the machine learning library. The experimental results show that the proposed method detects adversarial examples with success rates of 99.05% and 99.9% for the untargeted and targeted cases in MNIST, respectively, and 94.7% and 95.8% for the untargeted and targeted cases in CIFAR10, respectively.

Highlights

  • Deep neural networks (DNNs) [26] provide excellent performance on machine learning tasks such as image recognition [28], speech recognition [10, 11], pattern analysis [4], and intrusion detection [24]

  • DNNs are vulnerable to adversarial examples [29, 33], which are created by adding a small amount of noise to an original sample

  • The threshold is the value applied as the judgment criterion: if the classification score difference is less than the threshold, the input is identified as an adversarial example, and if it is greater than the threshold, the input is determined to be an original sample (see the sketch after these highlights)
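
The following is a minimal sketch of that judgment criterion, assuming the classification score difference is the gap between the two largest softmax scores of the classifier; the function name, the example scores, and the threshold value are illustrative and not taken from the paper.

```python
import numpy as np

def is_adversarial(scores, threshold):
    """Flag an input as adversarial when the gap between its two highest
    classification scores falls below the threshold."""
    top_two = np.sort(scores)[-2:]             # two largest classification scores
    score_difference = top_two[1] - top_two[0]
    return score_difference < threshold        # small gap -> adversarial example

# A confident prediction (original sample) vs. an ambiguous one (adversarial example)
print(is_adversarial(np.array([0.01, 0.02, 0.95, 0.02]), threshold=0.5))  # False
print(is_adversarial(np.array([0.30, 0.05, 0.38, 0.27]), threshold=0.5))  # True
```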


Summary

Introduction

Deep neural networks (DNNs) [26] provide excellent performance on machine learning tasks such as image recognition [28], speech recognition [10, 11], pattern analysis [4], and intrusion detection [24]. However, if an attacker generates a modified left-turn sign so that it will be incorrectly categorized by a DNN, an autonomous vehicle equipped with that DNN will incorrectly classify the modified sign as pointing to the right, whereas a person will correctly classify it as pointing to the left. Because such adversarial examples are serious threats to a DNN, several defenses against them are being studied. Adversarial examples can be detected using a pattern characteristic of their classification scores, without the need to change the classifier or modify the input data. We propose a defense system for detecting adversarial examples that exploits this pattern feature of the classification scores and therefore requires no modification to the classifier or the input data.
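
As a rough illustration of what "classification scores" refers to here, the sketch below reads the per-class softmax outputs of a trained TensorFlow/Keras classifier for a single input; the model file name is a placeholder, not a path from the paper.

```python
import numpy as np
import tensorflow as tf

# Placeholder: any trained Keras classifier whose final layer is a softmax.
model = tf.keras.models.load_model("mnist_classifier.h5")

def classification_scores(model, x):
    """Return the per-class classification scores (softmax outputs) for one input."""
    x = np.expand_dims(x, axis=0)              # add the batch dimension
    return model.predict(x, verbose=0)[0]      # one score per class

# An original sample usually yields one dominant score, while an adversarial
# example tends to spread its scores more evenly across classes; the detector
# exploits this difference in the score pattern.
```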

