Abstract

Adversarial examples are used to evaluate the robustness of convolutional neural networks (CNNs) to input perturbations. Researchers have proposed different types of adversarial examples that fool CNNs, and these attacks pose a serious threat to applications that rely on deep neural networks. Existing methods for adversarial image generation struggle to balance attack success rate against the imperceptibility (measured by the ℓ2-norm) of the generated adversarial examples. Recent sparse methods focus on limiting the number of perturbed pixels but do not address the overall imperceptibility of the adversarial images. To address these problems, we introduce adversarial attacks based on K-singular value decomposition (K-SVD) sparse dictionary learning. The dictionary is learned from feature maps of the targeted images extracted from the first layer of the CNN. The proposed method is evaluated in terms of attack success rate and ℓ2-norm. Extensive experiments show that our attack achieves a high success rate while maintaining a low ℓ2 perturbation norm compared to state-of-the-art methods.
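As a rough illustration of the dictionary-learning step described above (not the authors' released code), the following sketch learns a K-SVD dictionary over flattened first-layer feature maps, using orthogonal matching pursuit from scikit-learn for the sparse-coding step. The function and variable names (ksvd, feature_maps, n_atoms, sparsity) are assumptions made for the example.

```python
# Minimal K-SVD sketch, assuming first-layer feature maps have been flattened
# into the columns of a matrix Y of shape (d, n). Illustrative only.
import numpy as np
from sklearn.linear_model import orthogonal_mp

def ksvd(Y, n_atoms, sparsity, n_iter=10, seed=0):
    """Learn a dictionary D (d x n_atoms) and sparse codes X (n_atoms x n)."""
    rng = np.random.default_rng(seed)
    # Initialize atoms with randomly chosen signals, normalized to unit norm.
    D = Y[:, rng.choice(Y.shape[1], n_atoms, replace=False)].astype(float)
    D /= np.linalg.norm(D, axis=0, keepdims=True) + 1e-12

    for _ in range(n_iter):
        # Sparse-coding step: OMP with a fixed number of non-zero coefficients.
        X = orthogonal_mp(D, Y, n_nonzero_coefs=sparsity)
        # Dictionary-update step: refine each atom with a rank-1 SVD of the
        # residual restricted to the signals that actually use that atom.
        for k in range(n_atoms):
            users = np.nonzero(X[k, :])[0]
            if users.size == 0:
                continue
            E = Y[:, users] - D @ X[:, users] + np.outer(D[:, k], X[k, users])
            U, S, Vt = np.linalg.svd(E, full_matrices=False)
            D[:, k] = U[:, 0]
            X[k, users] = S[0] * Vt[0, :]
    return D, X
```

In the attack setting sketched by the abstract, the learned dictionary would then be used to build sparse perturbations of the targeted images; that step depends on details not given here and is omitted.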
