Abstract

Many black-box adversarial attack algorithms exploit the transferability of adversarial examples to attack machine learning models, and input transformation-based attacks are among the most effective approaches. However, existing input transformation methods ignore two observations: each pixel contributes differently to the model's output, and different models focus on similar regions of the same image. Therefore, this paper proposes MixCam, a targeted data augmentation-based adversarial attack algorithm that augments the input data according to each pixel's contribution to the prediction, enhancing the transferability of the crafted adversarial examples by shifting the regions to which the model pays the most attention. In addition, this paper proposes fusing the class activation maps of multiple models for the input image to further boost transferability. Furthermore, MixCam can be combined with other input transformation methods to further improve the transferability of the crafted adversarial examples. Extensive experiments on ImageNet demonstrate that MixCam outperforms other state-of-the-art methods in black-box attacks against the considered adversarially trained models, with average increases of 11.7% and 10.7% in attack success rate under the single-model and ensemble attack settings, respectively.
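The abstract does not give implementation details, so the following PyTorch sketch only illustrates the general idea it describes: computing a class activation map (here via Grad-CAM) and using it to guide how an auxiliary image is mixed into the input's high-attention regions. The function names (grad_cam, cam_guided_mix), the blending rule, and the mixing strength lam are assumptions for illustration, not the paper's actual MixCam procedure.

```python
import torch
import torch.nn.functional as F
from torchvision import models


def grad_cam(model, x, target_layer, class_idx=None):
    """Compute a Grad-CAM heatmap for an input batch x of shape (N, 3, H, W)."""
    feats, grads = [], []

    def fwd_hook(_, __, output):
        feats.append(output)

    def bwd_hook(_, grad_in, grad_out):
        grads.append(grad_out[0])

    h_fwd = target_layer.register_forward_hook(fwd_hook)
    h_bwd = target_layer.register_full_backward_hook(bwd_hook)

    logits = model(x)
    if class_idx is None:
        class_idx = logits.argmax(dim=1)          # use the predicted class
    score = logits.gather(1, class_idx.view(-1, 1)).sum()
    model.zero_grad()
    score.backward()

    h_fwd.remove()
    h_bwd.remove()

    fmap, grad = feats[0], grads[0]               # (N, C, h, w)
    weights = grad.mean(dim=(2, 3), keepdim=True) # per-channel importance
    cam = F.relu((weights * fmap).sum(dim=1, keepdim=True))
    cam = F.interpolate(cam, size=x.shape[-2:], mode="bilinear",
                        align_corners=False)
    cam = cam / (cam.amax(dim=(2, 3), keepdim=True) + 1e-8)
    return cam.detach()                           # (N, 1, H, W), values in [0, 1]


def cam_guided_mix(x, x_aux, cam, lam=0.4):
    """Blend an auxiliary image into the high-attention regions of x.
    The mixing strength lam is a hypothetical parameter for illustration."""
    return x * (1 - lam * cam) + x_aux * (lam * cam)


# Example usage (hypothetical settings)
model = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V1).eval()
x = torch.rand(1, 3, 224, 224)      # stand-in for a preprocessed input image
x_aux = torch.rand(1, 3, 224, 224)  # stand-in for a randomly drawn image
cam = grad_cam(model, x, model.layer4[-1])
x_mixed = cam_guided_mix(x, x_aux, cam)
```

A mixed input of this kind could serve as the augmented sample fed to a gradient-based attack such as I-FGSM; the abstract further suggests averaging the activation maps of several source models before mixing, which in this sketch would amount to averaging the cam tensors produced by each model.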
