Abstract

In recent years, artificial intelligence (AI) technologies have been widely used in computer vision, natural language processing, autonomous driving, and other fields. However, AI systems are vulnerable to adversarial attacks, which limits their application in security-critical fields. Improving the robustness of AI systems against adversarial attacks has therefore become increasingly important to the further development of AI. This paper comprehensively summarizes the latest research progress on adversarial attack and defense technologies in deep learning. According to the stage of the target model at which an adversarial attack occurs, we describe adversarial attack methods in the training stage and in the testing stage, respectively. We then survey applications of adversarial attack technologies in computer vision, natural language processing, cyberspace security, and the physical world. Finally, we describe existing adversarial defense methods in three main categories: modifying data, modifying models, and using auxiliary tools.

Highlights

  • The applications of artificial intelligence technologies in various fields have developed rapidly in recent years

  • To improve the robustness of neural networks against adversarial attacks, researchers have proposed numerous adversarial defense methods, which fall into three main categories: modifying data, modifying models, and using auxiliary tools

  • In white-box attacks, the adversary knows the structure and parameters of the target model. Although such access is difficult to obtain in practice, once an adversary gains it, the threat to a machine learning model is severe, because the adversary can construct adversarial samples by analyzing the target model's structure
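The white-box setting in the highlight above can be illustrated with a minimal sketch of the fast gradient sign method (FGSM), a standard white-box attack (not one specific to this paper). The toy logistic-regression model, its weights, and the sample values below are hypothetical; the point is only that full knowledge of the model lets the adversary compute the exact input gradient of the loss and perturb the sample along its sign.

```python
import numpy as np

def fgsm_perturb(x, w, b, y, epsilon):
    """FGSM-style white-box perturbation against a logistic-regression model.

    Because the adversary knows w and b (white-box setting), the gradient of
    the cross-entropy loss w.r.t. the input is known analytically:
        L = -log p(y | x),  dL/dx = (sigmoid(w.x + b) - y) * w
    The sample is nudged in the sign of that gradient to increase the loss.
    """
    p = 1.0 / (1.0 + np.exp(-(np.dot(w, x) + b)))  # model's probability of class 1
    grad = (p - y) * w                             # analytic input gradient of the loss
    return x + epsilon * np.sign(grad)

# Hypothetical model and a correctly classified sample of class 1.
w = np.array([2.0, -1.0])
b = 0.0
x = np.array([1.0, 0.2])
x_adv = fgsm_perturb(x, w, b, y=1.0, epsilon=0.5)

score = np.dot(w, x) + b          # original logit
score_adv = np.dot(w, x_adv) + b  # adversarial logit, pushed toward class 0
```

With these toy numbers the logit drops from 1.8 to 0.3: the perturbation moves the sample toward the decision boundary, exactly the behavior the highlight warns about when model internals leak.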


Summary

Introduction

The applications of artificial intelligence technologies in various fields have developed rapidly in recent years. To improve the robustness of neural networks against adversarial attacks, researchers have proposed numerous adversarial defense methods, which fall into three main categories: modifying data, modifying models, and using auxiliary tools. Adversarial training is a widely used method of modifying data; Szegedy et al. [6] injected adversarial samples and modified their labels to improve the robustness of the target model. Defense-GAN, which is applicable to both white-box and black-box attacks, reduces the effectiveness of adversarial perturbations by exploiting the power of a generative adversarial network [55]; the main idea is to "project" an input image x onto the range of the generator G by minimizing the reconstruction error ‖G(z) − x‖₂² before feeding the image to the classifier.
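The projection step described above can be sketched numerically. The linear "generator", its weights, and the 3-D toy inputs below are hypothetical stand-ins for a trained GAN; the sketch only demonstrates the core Defense-GAN idea of minimizing ‖G(z) − x‖₂² over z by gradient descent, so that off-manifold (adversarial) components of x are discarded before classification.

```python
import numpy as np

def project_onto_generator(x, G_w, steps=200, lr=0.1, seed=0):
    """Project input x onto the range of a toy linear generator G(z) = G_w @ z
    by gradient descent on the reconstruction error ||G(z) - x||_2^2.

    Returns G(z*), the cleaned input that Defense-GAN feeds to the classifier.
    """
    rng = np.random.default_rng(seed)
    z = rng.standard_normal(G_w.shape[1])  # random initial latent code
    for _ in range(steps):
        r = G_w @ z - x          # residual G(z) - x
        grad = 2.0 * G_w.T @ r   # gradient of ||G(z) - x||^2 w.r.t. z
        z -= lr * grad
    return G_w @ z

# Hypothetical 3-D inputs whose "clean" manifold is the 1-D span of [1, 1, 0].
G_w = np.array([[1.0], [1.0], [0.0]])
x_adv = np.array([1.0, 1.0, 0.8])   # clean point plus an off-manifold perturbation
x_clean = project_onto_generator(x_adv, G_w)
```

Because the perturbation along the third coordinate lies outside the generator's range, the projection removes it entirely and recovers the clean point, which is the mechanism by which Defense-GAN weakens adversarial perturbations.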

Adversarial Samples and Adversarial Attack Strategies
Causes of Adversarial Examples
Characteristics of Adversarial Examples
Adversarial Capabilities
Adversarial Goals
Adversarial Attacks
Training Stage Adversarial Attacks
Testing Stage Adversarial Attacks
White-Box Attacks
Black-Box Attacks
Adversarial Attack Applications
Methods
Semantic Image Segmentation and Object Detection
Text Classification
Machine Translation
Cloud Service
Malware Detection
Intrusion Detection
Spoofing Camera
Road Sign Recognition
Machine Vision
Face Recognition
Adversarial Training
Gradient Hiding
Blocking the Transferability
Data Compression
Data Randomization
Regularization
Defensive Distillation
Feature Squeezing
Mask Defense
Parseval Networks
Defense-GAN
MagNet
Findings
Conclusions