Abstract

Deep learning architectures are vulnerable to adversarial perturbations: small changes added to the input that drastically alter the output of deep networks. Inputs modified in this way are called adversarial examples, and they have been observed in learning tasks ranging from supervised learning to unsupervised and reinforcement learning. In this chapter, we review some of the most important highlights in the theory and practice of adversarial examples, focusing on the design of adversarial attacks, theoretical investigations into the nature of adversarial examples, and defenses against adversarial attacks. A common thread in the design of adversarial attacks is the perturbation analysis of learning algorithms; many existing algorithms rely implicitly on perturbation analysis to generate adversarial examples, and we summarize the most powerful attacks in this light. We also survey theories on the existence of adversarial examples, as well as theories relating the generalization error to adversarial robustness. Finally, we discuss various defenses against adversarial examples.
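As an illustration of the perturbation-analysis view, consider the fast gradient sign method (FGSM) of Goodfellow et al., a canonical attack that linearizes the loss around the input and steps in the direction that most increases it. The sketch below is a minimal PyTorch rendition, not code from the chapter; the model, the step size epsilon, and the [0, 1] pixel range are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def fgsm_attack(model, x, y, epsilon=0.03):
    """Fast gradient sign method: a first-order perturbation of the input.

    Assumes `model` maps inputs in [0, 1] to class logits; epsilon
    controls the size of the perturbation (both are assumptions here).
    """
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)
    loss.backward()
    # Step of size epsilon along the sign of the input gradient,
    # i.e., the direction that locally maximizes the loss.
    x_adv = x + epsilon * x.grad.sign()
    return x_adv.clamp(0.0, 1.0).detach()
```

Iterating this step with a small step size and projecting back onto an epsilon-ball yields the stronger projected gradient descent (PGD) attack, which follows the same perturbation-analysis template.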
