Abstract

With the rapid development of artificial intelligence, deep learning has attracted increasing attention. While deep neural networks have made remarkable progress in many domains, including Computer Vision and Natural Language Processing, recent studies show that they are vulnerable to adversarial attacks, which take legitimate images carrying imperceptible perturbations as input and mislead the model into producing incorrect outputs. We consider the key point of an adversarial attack to be the imperceptible perturbation added to the input, so eliminating the effect of this added noise is of great significance. We therefore design a new, efficient model based on the residual image that can detect such potential adversarial attacks. We design a method to obtain the residual image, which captures these possible perturbations; based on the residual image, a detection mechanism decides whether an input is adversarial or not. A series of experiments has been carried out, and the results show that the new detection method detects adversarial attacks with high effectiveness.
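The abstract does not spell out how the residual image is computed or how the detector uses it. The sketch below is a minimal illustration under the assumption that the residual is the difference between the input and a denoised (median-filtered) copy of it, with a simple residual-energy threshold standing in for the paper's learned detection mechanism; the filter size and threshold are hypothetical values chosen for the example, not parameters taken from the paper.

```python
# Illustrative sketch only: residual-image-based adversarial detection.
# Assumption: residual = image - median_filter(image); a fixed threshold on
# the residual energy stands in for the paper's detection mechanism.
import numpy as np
from scipy.ndimage import median_filter


def residual_image(image: np.ndarray, filter_size: int = 3) -> np.ndarray:
    """Return the residual between an HxWxC image and its denoised copy.

    Adversarial perturbations are largely high-frequency, so they tend to
    survive in this residual, while natural image content is removed.
    """
    denoised = median_filter(image, size=(filter_size, filter_size, 1))
    return image.astype(np.float32) - denoised.astype(np.float32)


def is_adversarial(image: np.ndarray, threshold: float = 2.5) -> bool:
    """Flag the image if its mean absolute residual is unusually high.

    `threshold` is a hypothetical value; in practice it would be calibrated
    on clean validation images, or the residual would be fed to a trained
    classifier as in the paper's detection mechanism.
    """
    residual = residual_image(image)
    return float(np.abs(residual).mean()) > threshold


if __name__ == "__main__":
    # Smooth synthetic "clean" image vs. the same image with small noise
    # standing in for an adversarial perturbation.
    ramp = np.linspace(0.0, 255.0, 32)
    clean = np.tile(ramp, (32, 1))[..., None].repeat(3, axis=-1)
    perturbed = clean + np.random.uniform(-8.0, 8.0, clean.shape)
    print(is_adversarial(clean), is_adversarial(perturbed))  # False True
```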
