Abstract

In recent years, deep neural networks (DNNs) have achieved impressive performance in many artificial intelligence (AI) fields. However, adversarial examples can easily induce them to behave incorrectly. Defenses against adversarial examples have therefore proliferated, but existing methods share several shortcomings: they are practical only for specific attacks, e.g., optimization-based attacks; training networks to detect and identify adversarial examples requires enormous computational resources; and processing adversarial examples introduces other side effects. In this paper, we apply a mirror flip to adversarial examples that have first been processed by image filtering algorithms (both linear and nonlinear), reducing the impact of adversarial perturbations and addressing the drawbacks above. Experimental results on ImageNet show that we can substantially improve robustness to adversarial attacks at the expense of some image sharpness. For example, accuracy improved from 33.54% to 82.03% under the Projected Gradient Descent (PGD) L∞ attack with ε = 1/256 (the accuracy on the clean training set we tested is 84.71%). Moreover, our approach generalizes to most state-of-the-art attack algorithms across different perturbation magnitudes, and it outperforms other input transformation-based defenses in both accuracy improvement and image quality evaluation.
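The defense described above is a pure input transformation: each (possibly adversarial) image is filtered and then mirror-flipped before being passed to the unmodified classifier. The following is a minimal sketch of that pipeline, assuming PIL for the image operations; the specific filter (a 3x3 median filter as the nonlinear option) and the horizontal flip axis are illustrative assumptions, since the abstract does not fix these parameters.

```python
# Sketch of the filter-then-mirror-flip input transformation defense.
# Assumptions: PIL is used for image processing; the classifier and its
# preprocessing are placeholders, not part of the paper's released code.
from PIL import Image, ImageFilter, ImageOps


def defend(image: Image.Image) -> Image.Image:
    """Suppress adversarial perturbations by filtering, then mirror flipping."""
    # Nonlinear smoothing (median filter) removes high-frequency perturbations,
    # trading away some image sharpness as noted in the abstract.
    filtered = image.filter(ImageFilter.MedianFilter(size=3))
    # Mirror flip further disrupts perturbations tied to exact pixel positions.
    return ImageOps.mirror(filtered)


# Example usage (classifier `model` and `preprocess` are hypothetical):
# img = Image.open("example.png").convert("RGB")
# logits = model(preprocess(defend(img)))
```

Because the transformation is applied only at inference time, it requires no retraining or detection network, which is what distinguishes it from the more computation-heavy defenses criticized in the abstract.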
