Abstract

Deep neural networks (DNNs) are vulnerable to adversarial examples, which are crafted by adding small, human-imperceptible perturbations to original images yet cause the model to output incorrect predictions. Adversarial attacks are therefore an important tool for evaluating and selecting robust models before deployment. However, under the challenging black-box setting, most existing adversarial attacks fool a model only with a low success rate. Drawing on image augmentation methods, we find that randomly transforming the image size can alleviate overfitting during the generation of adversarial examples and improve their transferability. Based on this observation, we propose an adversarial example generation method that can be integrated with Fast Gradient Sign Method (FGSM)-based attacks to build a stronger gradient-based attack that produces more transferable adversarial examples. Extensive experiments on the ImageNet dataset demonstrate that our method attacks both normally trained and adversarially trained models with higher success rates than existing baseline attacks. We hope the proposed attack can serve as a benchmark for evaluating the robustness of networks to adversaries and the effectiveness of different defense methods.
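To make the idea concrete, the following is a minimal PyTorch sketch of how a random size transformation might be folded into an iterative FGSM attack, in the spirit of the method described above. The helper random_resize, the scale range, and all hyperparameters are illustrative assumptions, not the paper's exact procedure.

```python
import torch
import torch.nn.functional as F

def random_resize(x, min_scale=0.8, p=0.5):
    # Hypothetical augmentation: with probability p, shrink the batch of
    # images to a random size and zero-pad it back to the original
    # resolution at a random offset; otherwise leave it unchanged.
    if torch.rand(1).item() > p:
        return x
    _, _, h, w = x.shape
    new_h = max(1, int(h * torch.empty(1).uniform_(min_scale, 1.0).item()))
    new_w = max(1, int(w * torch.empty(1).uniform_(min_scale, 1.0).item()))
    resized = F.interpolate(x, size=(new_h, new_w), mode="bilinear",
                            align_corners=False)
    top = torch.randint(0, h - new_h + 1, (1,)).item()
    left = torch.randint(0, w - new_w + 1, (1,)).item()
    return F.pad(resized, (left, w - new_w - left, top, h - new_h - top))

def resize_ifgsm(model, x, y, eps=16 / 255, steps=10):
    # Iterative FGSM in which each gradient step is taken on a randomly
    # resized copy of the current adversarial image, so the perturbation
    # does not overfit the source model's fixed input geometry.
    alpha = eps / steps
    x_adv = x.clone().detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(random_resize(x_adv)), y)
        grad = torch.autograd.grad(loss, x_adv)[0]
        x_adv = x_adv.detach() + alpha * grad.sign()
        # Project back into the L-infinity eps-ball and the valid pixel range.
        x_adv = torch.max(torch.min(x_adv, x + eps), x - eps).clamp(0, 1)
    return x_adv
```

Because the resizing is applied only when computing gradients, the returned adversarial image stays at the original resolution and can be transferred as-is to black-box target models.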
