Abstract

In deep learning, repeated convolution and pooling operations help learn image features, but the complex nonlinear transformations involved make deep learning models difficult for users to understand. The adversarial example attack is a form of attack specific to deep learning: the attacker applies imperceptible changes to an image so that the model produces an incorrect prediction. This paper studies adversarial example attacks together with neural network interpretability. Interpretability research is believed to have considerable potential for resisting adversarial examples: it helps explain how an adversarial example induces the network to make a wrong judgment and supports identifying adversarial examples in the test set. An image recognition model was built on the ImageNet training set, and an adversarial-example generation algorithm and a neural network visualization algorithm were then designed to produce heat maps of what the model learns from the original example and from the adversarial example. The results show how neural network interpretability can be applied to resist adversarial-example attacks.
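The abstract does not name the specific attack or visualization method used. As one plausible instantiation of the described pipeline, the minimal sketch below pairs an FGSM-style adversarial perturbation with a Grad-CAM heat map so the clean and adversarial attention maps can be compared; the pretrained ResNet-50, layer choice, and epsilon value are illustrative assumptions, not the paper's method.

```python
# Hypothetical sketch: FGSM adversarial example + Grad-CAM heat maps.
# The paper does not specify its attack or visualization algorithm; FGSM and
# Grad-CAM are assumed stand-ins, and the model/weights below are illustrative.
import torch
import torch.nn.functional as F
from torchvision import models

model = models.resnet50(weights=models.ResNet50_Weights.DEFAULT).eval()

def fgsm(image, label, epsilon=0.003):
    """Generate an adversarial example with the fast gradient sign method."""
    image = image.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(image), label)
    loss.backward()
    # Perturb in the direction that increases the loss, then clip to a valid range.
    return (image + epsilon * image.grad.sign()).clamp(0, 1).detach()

def grad_cam(image, target_layer):
    """Compute a Grad-CAM heat map for the model's predicted class."""
    activations, gradients = [], []
    h1 = target_layer.register_forward_hook(lambda m, i, o: activations.append(o))
    h2 = target_layer.register_full_backward_hook(lambda m, gi, go: gradients.append(go[0]))
    logits = model(image)
    logits[0, logits.argmax()].backward()
    h1.remove(); h2.remove()
    # Weight each channel by its averaged gradient and keep positive evidence only.
    weights = gradients[0].mean(dim=(2, 3), keepdim=True)
    cam = F.relu((weights * activations[0]).sum(dim=1, keepdim=True))
    return F.interpolate(cam, size=image.shape[-2:], mode="bilinear", align_corners=False)

# Usage: compare the heat map of a clean image with that of its adversarial version.
image = torch.rand(1, 3, 224, 224)       # placeholder for a preprocessed ImageNet image
label = model(image).argmax(dim=1)
adv = fgsm(image, label)
cam_clean = grad_cam(image, model.layer4[-1])
cam_adv = grad_cam(adv, model.layer4[-1])
```

Overlaying `cam_clean` and `cam_adv` on the input would show whether the perturbation shifts the regions the model attends to, which is the kind of comparison the abstract describes.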
