Abstract
In recent years, artificial intelligence (AI) technologies have been widely applied in computer vision, natural language processing, autonomous driving, and other fields. However, AI systems are vulnerable to adversarial attacks, which limits their application in security-critical fields. Improving the robustness of AI systems against adversarial attacks has therefore become increasingly important to the further development of AI. This paper comprehensively summarizes the latest research progress on adversarial attack and defense technologies in deep learning. According to the stage of the target model at which an adversarial attack occurs, we describe attack methods in the training stage and the testing stage, respectively. We then survey the applications of adversarial attack technologies in computer vision, natural language processing, cyberspace security, and the physical world. Finally, we describe the existing adversarial defense methods in three main categories, i.e., modifying data, modifying models, and using auxiliary tools.
Highlights
The applications of artificial intelligence technologies in various fields have developed rapidly in recent years
To improve the robustness of neural networks against adversarial attacks, researchers have proposed a large number of adversarial defense methods, which can be divided into three main categories: modifying data, modifying models, and using auxiliary tools
In white-box attacks, the adversary knows the structure and parameters of the target model. Although such access is difficult to obtain in practice, once an adversary gains it, the threat to a machine learning model is severe, because the adversary can construct adversarial samples by analyzing the structure of the target model
Summary
The applications of artificial intelligence technologies in various fields have developed rapidly in recent years. To improve the robustness of neural networks against adversarial attacks, researchers have proposed a large number of adversarial defense methods, which can be divided into three main categories: modifying data, modifying models, and using auxiliary tools. Adversarial training is a widely used method of modifying data: Szegedy et al. [6] injected adversarial samples with corrected labels into the training set to improve the robustness of the target model. Defense Generative Adversarial Nets (Defense-GAN) [55], applicable to both white-box and black-box attacks, utilizes the power of a generative adversarial network to reduce the effect of adversarial perturbations; the main idea is to "project" an input image x onto the range of the generator G by minimizing the reconstruction error ‖G(z) − x‖₂², prior to feeding the reconstruction to the classifier.
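The data-modification idea credited to Szegedy et al. [6] above can be sketched as follows. The toy logistic-regression model, the one-step sign-gradient (FGSM-style) perturbation, and all hyperparameters here are illustrative assumptions, not the authors' exact procedure; the point is only the training loop's structure: craft adversarial copies of the inputs, keep their correct labels, and train on the augmented set.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fgsm_perturb(x, y, w, b, eps=0.1):
    """One-step sign-gradient perturbation of input x for a logistic model
    (an illustrative stand-in for a general adversarial-sample generator)."""
    p = sigmoid(x @ w + b)
    grad_x = (p - y) * w          # d(cross-entropy)/dx for logistic regression
    return x + eps * np.sign(grad_x)

def adversarial_train(X, y, epochs=300, lr=0.5, eps=0.1, seed=0):
    """Train on the original samples plus adversarial copies with correct labels."""
    rng = np.random.default_rng(seed)
    w = rng.normal(size=X.shape[1]) * 0.01
    b = 0.0
    for _ in range(epochs):
        # Augment the training set with adversarial copies, labelled correctly.
        X_adv = np.array([fgsm_perturb(x, t, w, b, eps) for x, t in zip(X, y)])
        X_all = np.vstack([X, X_adv])
        y_all = np.concatenate([y, y])
        p = sigmoid(X_all @ w + b)
        w -= lr * (X_all.T @ (p - y_all)) / len(y_all)
        b -= lr * np.mean(p - y_all)
    return w, b

# Toy linearly separable data (illustrative): two clusters, labels 0 and 1.
X = np.array([[0.0, 0.0], [0.2, 0.1], [1.0, 1.0], [0.9, 1.1]])
y = np.array([0.0, 0.0, 1.0, 1.0])
w, b = adversarial_train(X, y)
```

Because every adversarial copy keeps the correct label, the classifier is explicitly penalized for misclassifying perturbed inputs, which is what makes the resulting decision boundary more robust near the training points.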
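The Defense-GAN projection step [55] can likewise be sketched: minimize ‖G(z) − x‖₂² over the latent code z and hand the reconstruction G(z*) to the classifier instead of x. A toy linear "generator" G(z) = Az stands in here for a trained GAN generator, and the plain gradient descent over z is an illustrative assumption (the original work uses several random restarts of gradient descent).

```python
import numpy as np

def project_onto_generator(x, G, jac_G, z_dim, steps=500, lr=0.05, seed=0):
    """Minimize ||G(z) - x||_2^2 over z by gradient descent; return G(z*)."""
    rng = np.random.default_rng(seed)
    z = rng.normal(size=z_dim)
    for _ in range(steps):
        r = G(z) - x                       # reconstruction residual
        z -= lr * (2.0 * jac_G(z).T @ r)   # chain rule: dL/dz = 2 J_G^T r
    return G(z)

# Toy linear generator G(z) = A z; its Jacobian is A itself (illustrative).
A = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])
G = lambda z: A @ z
jac_G = lambda z: A

x_clean = G(np.array([0.5, -0.3]))            # a point in the generator's range
x_adv = x_clean + np.array([0.3, -0.6, 0.0])  # perturbation pushing x off that range
x_proj = project_onto_generator(x_adv, G, jac_G, z_dim=2)
```

Since G(z*) lies on the generator's range by construction, the off-manifold component of the adversarial perturbation is discarded before classification, which is why the defense applies regardless of whether the attack was white-box or black-box.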