Abstract

Black-box adversarial attacks are more applicable in practical scenarios than white-box attacks and have received significant attention. However, most existing black-box attacks are optimized at the output layer, and the adversarial samples they generate are difficult to transfer from the surrogate model to the target model. Existing attacks that perturb intermediate-layer features do so indiscriminately, which makes them prone to falling into local optima of the surrogate model and limits their transferability. In this paper, we propose a transferable targeted adversarial attack framework based on a feature-aware triplet, which achieves a better tradeoff between attack ability and transferability by perturbing salient features and constructing representative sample features. The importance-weighted optimization objective perturbs the salient object-aware features of an image in a targeted manner. During optimization, it guides the adversarial perturbation to pull the intermediate features of a sample toward the target class while pushing them away from the source class, yielding a targeted transferable attack. Moreover, we build a feature library from averages weighted by feature importance to obtain more expressive intermediate features for the target and source classes. These refined target and source features are fed into the triplet to guide the optimization toward more transferable adversarial samples. Extensive experiments on the ImageNet-compatible dataset verify the effectiveness of the proposed method, e.g., improving the untargeted success rate by 1.8% and the targeted success rate by 2.3% against normally trained models compared with existing methods.
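
To make the pull/push idea concrete, the following is a minimal NumPy sketch of an importance-weighted triplet objective and a weighted-average class feature. It is an illustration under stated assumptions, not the formulation used in the paper: the function names `weighted_triplet_loss` and `library_feature`, the squared Euclidean distance, the hinge with a `margin` parameter, and the use of per-sample scalar scores as averaging weights are all hypothetical choices made here for clarity.

```python
import numpy as np

def weighted_triplet_loss(feat_adv, feat_target, feat_source, importance, margin=1.0):
    """Hypothetical importance-weighted triplet objective on intermediate features.

    Minimizing this value pulls the adversarial feature toward the target-class
    feature (positive) and pushes it away from the source-class feature (negative).
    The squared distance and hinge margin are assumptions, not taken from the paper.
    """
    # Scale every feature by the same importance map so salient entries dominate.
    w_adv = importance * feat_adv
    w_tgt = importance * feat_target
    w_src = importance * feat_source
    d_pos = np.sum((w_adv - w_tgt) ** 2)  # distance to the target class
    d_neg = np.sum((w_adv - w_src) ** 2)  # distance to the source class
    return max(d_pos - d_neg + margin, 0.0)

def library_feature(features, scores):
    """Assumed feature-library entry: a weighted average of per-sample features.

    `features` has shape (n_samples, ...) and `scores` has shape (n_samples,);
    the scores are normalized to sum to one before averaging.
    """
    features = np.asarray(features, dtype=float)
    scores = np.asarray(scores, dtype=float)
    weights = scores / scores.sum()
    # Contract the sample axis: sum_i weights[i] * features[i].
    return np.tensordot(weights, features, axes=1)
```

In this sketch the loss is zero once the weighted adversarial feature is closer to the target feature than to the source feature by at least the margin, and it grows as the feature drifts back toward the source class. In an actual attack the loss would be differentiated with respect to the input perturbation through the surrogate model, which is outside the scope of this illustration.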
