Abstract

Deep neural networks have achieved remarkable success in machine learning, computer vision, and pattern recognition over the last few decades. Recent studies, however, show that neural networks (both shallow and deep) can be easily fooled by imperceptibly perturbed input samples called adversarial examples. This security vulnerability has prompted a large body of research in recent years, because the wide deployment of neural networks means such examples could pose real-world threats. To address robustness to adversarial examples, particularly in pattern recognition, robust adversarial training has become a mainstream approach. Various ideas, methods, and applications have flourished in the field. Yet a deep understanding of adversarial training, including its characteristics, interpretations, theories, and the connections among different models, has remained elusive. This paper presents a comprehensive survey that offers a systematic and structured investigation of robust adversarial training in pattern recognition. We start with fundamentals, including the definition, notation, and properties of adversarial examples. We then introduce a general theoretical framework with gradient regularization for defending against adversarial samples, namely robust adversarial training, together with visualizations and interpretations of why adversarial training can lead to model robustness. We also establish connections between adversarial training and other traditional learning theories. After that, we summarize, review, and discuss various methodologies, with their defense and training algorithms, in a structured way. Finally, we present analysis, outlook, and remarks on adversarial training.

