Abstract

Deep neural networks (DNNs) have found wide application owing to their feature learning ability. However, recent studies have shown that DNNs are vulnerable to adversarial examples. Current research on adversarial example generation focuses primarily on improving the attack success rate (ASR) while reducing the perturbation size. By visualizing heat maps, previous works have found that the feature extraction capability of DNNs rests on precisely locating object contours and paying the correct attention to those regions. Consequently, the perturbations in adversarial examples weaken the localization of object contours in deep hidden layers and shrink the attention scope over the object area, which leads to successful attacks. Inspired by this observation, we propose FineFool, a novel adversarial attack based on an attention perturbation technique that combines channel-spatial attention and pixel-spatial attention: the former shrinks the area the DNN attends to, while the latter mislocates the object contours. By using this attention perturbation technique to target the more vulnerable positions in legitimate examples, FineFool achieves a higher ASR with smaller perturbations than state-of-the-art adversarial attacks. Extensive experiments are carried out on the MNIST, CIFAR10, and ImageNet datasets against six models. The results show that FineFool achieves the best performance compared with six baselines. More specifically, the mean ASRs of FineFool for untargeted and targeted attacks across all datasets are 99.23% and 98.26%, respectively, the highest under white-box attack settings. The code of FineFool is open sourced at https://zenodo.org/record/4421611#.X.
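The abstract describes FineFool as modulating the adversarial perturbation with channel-spatial and pixel-spatial attention so that the perturbation concentrates on the regions and contours the model relies on. As a rough illustration of that general idea only, the following is a minimal PyTorch sketch of an attention-weighted iterative attack; the gradient-magnitude attention map, the step sizes, and the function name are assumptions for illustration, not the authors' FineFool implementation.

    # Hypothetical sketch: attention-weighted adversarial perturbation (not the
    # authors' FineFool code). The attention map below is a simple gradient
    # saliency assumed for illustration.
    import torch
    import torch.nn.functional as F

    def attention_weighted_attack(model, x, y, epsilon=8/255, alpha=2/255, steps=10):
        # x: input batch in [0, 1]; y: true labels; untargeted white-box attack.
        x_adv = x.clone().detach()
        for _ in range(steps):
            x_adv.requires_grad_(True)
            loss = F.cross_entropy(model(x_adv), y)
            grad = torch.autograd.grad(loss, x_adv)[0]

            # Pixel-spatial attention (assumption): per-pixel gradient magnitude,
            # normalized per image, so each step concentrates on the regions the
            # model relies on most, e.g. object contours.
            attn = grad.abs().mean(dim=1, keepdim=True)
            attn = attn / (attn.amax(dim=(2, 3), keepdim=True) + 1e-12)

            # Attention-weighted signed-gradient step, projected back into the
            # L-infinity ball of radius epsilon around the clean input.
            x_adv = x_adv.detach() + alpha * attn * grad.sign()
            x_adv = torch.min(torch.max(x_adv, x - epsilon), x + epsilon).clamp(0, 1)
        return x_adv.detach()

Weighting the step by an attention map, rather than applying a uniform signed-gradient step, is one way to keep the total perturbation small while still moving the pixels that matter most to the classifier.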
