Abstract

It is well known that carefully crafted imperceptible perturbations can cause state-of-the-art deep learning classification models to misclassify. Understanding and analyzing these adversarial perturbations plays a crucial role in the design of robust convolutional neural networks. However, their mechanics are not well understood. In this work, we investigate the following question: Is there any relationship between the imperceptibility and the semantic significance of adversarial perturbations? In other words, can adversarial perturbations change only the regions of the image that are important for classification, while still remaining imperceptible? To address this, we generate adversarial perturbations by explicitly incorporating a “perceptual quality ball” constraint in our formulation. We pose this adversarial example generation problem as a tractable convex optimization problem, with constraints taken from a mathematically amenable variant of the popular SSIM index. We also provide a fast approximation algorithm to solve the proposed convex optimization problem, bringing our technique in line with other $l_p$-based techniques such as PGD, both in form and computational complexity. We use the MobileNetV2 network trained on the ImageNet dataset for our experiments. By comparing the SSIM maps generated by our method with class activation maps, we observe that the perceptually guided perturbations generated by the proposed technique introduce changes specifically in the regions that contribute to classification decisions, i.e., the proposed method induces changes that are semantically meaningful.
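
To make the attack formulation concrete, the sketch below shows a PGD-style loop in PyTorch that maximizes the classification loss while keeping the perturbed image inside an SSIM constraint. This is only an illustration of the general idea under simplifying assumptions, not the paper's algorithm: it uses a plain single-scale, whole-image SSIM rather than the mathematically amenable SSIM variant described above, and it approximates the projection onto the perceptual quality ball by bisecting a scale factor on the perturbation. The function names (`ssim`, `ssim_constrained_attack`) and parameters (`ssim_min`, `steps`, `alpha`) are hypothetical and chosen for illustration.

```python
import torch
import torch.nn.functional as F


def ssim(x, y, c1=0.01 ** 2, c2=0.03 ** 2):
    """Simplified single-scale SSIM using whole-image statistics.

    x, y: image batches of shape (B, C, H, W) with values in [0, 1].
    Returns a tensor of shape (B,) with one SSIM value per image.
    """
    mu_x = x.mean(dim=(1, 2, 3))
    mu_y = y.mean(dim=(1, 2, 3))
    var_x = x.var(dim=(1, 2, 3), unbiased=False)
    var_y = y.var(dim=(1, 2, 3), unbiased=False)
    cov = ((x - mu_x[:, None, None, None]) *
           (y - mu_y[:, None, None, None])).mean(dim=(1, 2, 3))
    return ((2 * mu_x * mu_y + c1) * (2 * cov + c2)) / (
        (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2))


def ssim_constrained_attack(model, x, y, ssim_min=0.95, steps=40, alpha=0.01):
    """PGD-like attack that keeps SSIM(x, x + delta) above ssim_min."""
    delta = torch.zeros_like(x, requires_grad=True)
    for _ in range(steps):
        loss = F.cross_entropy(model(x + delta), y)
        grad, = torch.autograd.grad(loss, delta)
        with torch.no_grad():
            delta += alpha * grad.sign()              # gradient-sign ascent step
            delta.copy_((x + delta).clamp(0, 1) - x)  # keep pixels in [0, 1]
            # Heuristic stand-in for projection onto the SSIM "perceptual
            # quality ball": if the constraint is violated, bisect a scale
            # s in [0, 1] so that SSIM(x, x + s * delta) >= ssim_min.
            if ssim(x, x + delta).min() < ssim_min:
                lo, hi = 0.0, 1.0
                for _ in range(10):
                    mid = 0.5 * (lo + hi)
                    if ssim(x, x + mid * delta).min() >= ssim_min:
                        lo = mid
                    else:
                        hi = mid
                delta.mul_(lo)
    return (x + delta).detach()
```

In this sketch, `model` would be, for example, a pretrained torchvision MobileNetV2 in `eval()` mode operating on inputs in [0, 1]; the paper instead replaces the heuristic scaling step with its convex formulation and fast approximation algorithm.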
