Abstract

A key foundation of fooling a neural network without access to its internals (i.e., a black-box attack) is the transferability of adversarial examples across different models. Many works have been devoted to enhancing the task-specific transferability of adversarial examples, whereas cross-task transferability has received little attention. In this paper, to enhance both types of transferability, we are the first to frame the transferability issue as a heterogeneous domain generalisation problem, which can be addressed by a general pipeline based on a domain-invariant feature extractor pre-trained on ImageNet. Specifically, we propose a distance metric attack (DMA) method that increases the latent-layer distance between the adversarial example and the benign example along the direction opposite to that guided by the cross-entropy loss. With this simple loss, DMA effectively enhances the domain-invariant transferability of adversarial examples, in both the task-specific case and the cross-task case. Additionally, DMA can be used to measure the robustness of the latent layers in a deep model. We empirically find that models with similar structures have consistent robustness at layers of similar depth, which reveals that model robustness is closely related to model structure. Extensive experiments on image classification, object detection, and semantic segmentation demonstrate that DMA improves the success rate of black-box attacks by more than 10% in the task-specific setting and by more than 5% in the cross-task setting.
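To make the core idea concrete, the following is a minimal PyTorch sketch of a distance-metric attack loss as the abstract describes it: an iterative attack that maximises the L2 distance, at a chosen latent layer, between the features of the adversarial example and those of the benign example. The function name `dma_attack`, the hook-based feature extraction, the I-FGSM-style update, and all hyperparameters are our assumptions for illustration; the paper's exact formulation may differ.

```python
import torch
import torch.nn.functional as F

def dma_attack(model, layer, x, eps=8/255, alpha=2/255, steps=10):
    """Hypothetical sketch of a distance metric attack (DMA).

    Pushes the latent representation of the adversarial example away
    from that of the benign example at one chosen layer, under an
    L-infinity budget `eps`, using sign-gradient ascent steps `alpha`.
    """
    feats = {}

    def hook(_module, _inputs, output):
        # Capture the activation of the chosen latent layer.
        feats["z"] = output

    handle = layer.register_forward_hook(hook)
    model.eval()

    # Record the benign latent features once, without gradients.
    with torch.no_grad():
        model(x)
        z_benign = feats["z"].detach()

    x_adv = x.clone().detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        model(x_adv)
        # Distance-metric loss: maximise the gap between adversarial
        # and benign latent features (gradient ascent on MSE distance).
        loss = F.mse_loss(feats["z"], z_benign)
        grad = torch.autograd.grad(loss, x_adv)[0]
        x_adv = x_adv.detach() + alpha * grad.sign()
        # Project back into the eps-ball around x and the valid range.
        x_adv = (x + (x_adv - x).clamp(-eps, eps)).clamp(0, 1)

    handle.remove()
    return x_adv
```

Because the loss depends only on intermediate features rather than task-specific logits, a perturbation crafted this way on one model can, in principle, be applied to models trained for other tasks that share the same feature extractor, which is what the cross-task setting measures.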
