Abstract

Deep neural networks (DNNs) are susceptible to adversarial attacks, and one important reason is that adversarial samples are transferable, i.e., samples generated against a particular network may also deceive other black-box models. However, existing transferable adversarial attacks tend to modify the input features of images directly and without selection in order to reduce the prediction accuracy of the surrogate model, which causes the adversarial samples to fall into a local optimum of that model. Surrogate models usually differ significantly from the victim model, and while attacking multiple models simultaneously can improve transferability, gathering many different models is difficult and expensive. We instead simulate diverse models through frequency-domain transformations to narrow the gap between the source and victim models and thereby improve transferability. At the same time, we disrupt, in the feature space, the important intermediate-layer features that drive the model's decision. Additionally, a smoothing loss is introduced to suppress high-frequency perturbations. Extensive experiments demonstrate that our FM-FSTA attack generates better-concealed and more transferable adversarial samples, and achieves a high deception rate even against adversarially trained models. Compared with other methods, FM-FSTA improves the attack success rate under different defense mechanisms, which reveals potential threats to current robust models.
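
To make the two ideas in the abstract concrete, below is a minimal, hedged sketch of (a) a random frequency-domain transformation that produces diverse "views" of an input, one common way to simulate different models, and (b) a total-variation style smoothing penalty on the perturbation. This is not the authors' FM-FSTA code; the function names (spectral_transform, smoothing_loss) and parameters (sigma, rho) are illustrative assumptions.

```python
# Illustrative sketch only, assuming a DCT-based spectral augmentation and a
# total-variation smoothing term; not the FM-FSTA reference implementation.
import numpy as np
from scipy.fft import dctn, idctn


def spectral_transform(x, sigma=16 / 255, rho=0.5, rng=None):
    """Randomly perturb and rescale the frequency components of image x (H, W, C) in [0, 1]."""
    rng = np.random.default_rng() if rng is None else rng
    noise = rng.normal(0.0, sigma, size=x.shape)            # small additive spatial noise
    spectrum = dctn(x + noise, axes=(0, 1), norm="ortho")   # move to the frequency domain
    mask = rng.uniform(1 - rho, 1 + rho, size=x.shape)      # random per-frequency scaling
    x_aug = idctn(spectrum * mask, axes=(0, 1), norm="ortho")
    return np.clip(x_aug, 0.0, 1.0)


def smoothing_loss(delta):
    """Total-variation penalty that discourages high-frequency structure in the perturbation delta."""
    return np.abs(np.diff(delta, axis=0)).sum() + np.abs(np.diff(delta, axis=1)).sum()


# Usage: augment one image several times to mimic an ensemble of surrogate models.
x = np.random.rand(224, 224, 3)
views = [spectral_transform(x) for _ in range(4)]
```

In an attack loop, each augmented view would be fed through the surrogate network, the feature-space and classification losses accumulated across views, and the smoothing penalty added before updating the perturbation.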
