Abstract

Universal Adversarial Perturbations (UAPs), which are image-agnostic adversarial perturbations, have been demonstrated to successfully deceive computer vision models. Existing data-dependent UAP methods use either the activations of internal layers or the decision values of the output layer as supervision. In this paper, we use both to drive the supervised learning of a UAP, termed fully supervised UAP (FS-UAP), and design a progressive optimization strategy to solve it. Specifically, we define an internal-layer supervised objective that relies on the activations of multiple major internal layers to estimate how far adversarial examples deviate from legitimate examples. We also define an output-layer supervised objective that relies on the logits of the output layer to evaluate the degree of attack. In addition, we use the UAP found in the previous stage as the initial solution of the next stage, so that the UAP is progressively optimized stage by stage. We evaluate the proposed FS-UAP on seven networks and the ImageNet dataset, and provide an in-depth analysis of the latent factors affecting the performance of universal attacks. The experimental results show that our FS-UAP (i) has a powerful capability to fool CNNs, (ii) exhibits superior transferability across models and weak data dependency, and (iii) is suitable for both untargeted and targeted attacks.
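The abstract describes two supervision signals (internal-layer activations and output-layer logits) combined into one objective, plus stage-wise warm-started optimization. The following NumPy sketch illustrates that structure on a toy two-layer "network"; all names (`W1`, `W2`, `fs_uap_objective`, `progressive_uap`), the exact loss terms, and the numerical-gradient optimizer are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

# Toy stand-in for a CNN: one internal layer and a logits layer.
rng = np.random.default_rng(0)
W1 = rng.normal(size=(8, 4))   # input -> internal activation
W2 = rng.normal(size=(3, 8))   # internal activation -> logits

def internal(x):
    # ReLU activation of the internal layer.
    return np.maximum(W1 @ x, 0.0)

def logits(x):
    return W2 @ internal(x)

def fs_uap_objective(x, delta):
    # Internal-layer term: push adversarial activations away from clean ones.
    deviation = np.linalg.norm(internal(x + delta) - internal(x))
    # Output-layer term: suppress the logit of the clean prediction.
    attacked_logit = logits(x + delta)[np.argmax(logits(x))]
    return -deviation + attacked_logit   # minimized over delta

def progressive_uap(data, eps=0.5, stages=3, steps=40, lr=0.05):
    delta = np.zeros(4)                  # first stage starts from zero
    for _ in range(stages):              # each stage warm-starts from the last
        for _ in range(steps):
            g = np.zeros_like(delta)
            for x in data:               # numerical gradient, for clarity only
                for i in range(delta.size):
                    d = np.zeros_like(delta); d[i] = 1e-4
                    g[i] += (fs_uap_objective(x, delta + d)
                             - fs_uap_objective(x, delta - d)) / 2e-4
            # Gradient step followed by projection onto the L_inf ball.
            delta = np.clip(delta - lr * g, -eps, eps)
    return delta
```

In this sketch, the single perturbation `delta` is shared across all inputs (image-agnostic), and each stage simply resumes from the previous stage's solution, mirroring the warm-start strategy the abstract outlines.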
