Abstract

Reinforcement learning (RL) is a core technology for modern artificial intelligence, and it has become a workhorse for AI applications ranging from Atari games to Connected and Automated Vehicle (CAV) systems. A reliable RL system is therefore the foundation for security-critical AI applications, and its security has attracted more concern than ever. However, recent studies have discovered that adversarial attacks, an interesting attack mode, are also effective when targeting neural network policies in the context of reinforcement learning, which has inspired innovative research in this direction. Hence, in this paper, we make the first attempt to conduct a comprehensive survey of adversarial attacks on reinforcement learning under AI security. Moreover, we briefly introduce the most representative defense technologies against existing adversarial attacks.

Highlights

  • Artificial intelligence (AI) is providing major breakthroughs in solving problems that have withstood many attempts, in natural language understanding, speech recognition, image understanding, and so on

  • The Policy Induction Attack (PIA) (Behzadan and Munir 2017), Strategically-Timed Attack (STA) (Lin et al 2017), Enchanting Attack (EA) (Lin et al 2017), and attack on VIN (AVI) (Liu et al 2017) are black-box attacks, in which the adversary has no knowledge of the training algorithm or the corresponding parameters of the model. For the threat model discussed in these works, the authors assumed that the adversary has access to the training environment but knows neither the random initializations of the target policy nor the learning algorithm

  • In this paper, we make the first attempt to conduct a comprehensive survey of adversarial attacks in the context of reinforcement learning under artificial intelligence (AI) security

Summary

Introduction

Artificial intelligence (AI) is providing major breakthroughs in solving problems that have withstood many attempts, in natural language understanding, speech recognition, image understanding, and so on.

White-box attack: Fast gradient sign method (FGSM). Huang et al (2017) first showed that adversarial attacks are effective when targeting neural network policies in reinforcement learning systems.

Start point-based adversarial attack on Q-learning (SPA). Xiang et al (2018) focused on adversarial example-based attacks on a representative reinforcement learning algorithm, Q-learning, in automatic path finding. They proposed a probabilistic output model based on the influence factors and their corresponding weights to predict adversarial examples in this scenario.

White-box based adversarial attack on DQN (WBA). Building on the SPA algorithm introduced above, Bai et al (2018) first used DQN to find the optimal path and analyzed the rules of DQN pathfinding. They proposed a method that can effectively find vulnerable points towards white-box Q-table variation in DQN pathfinding training. In order to calculate the Gradient Band more accurately, the authors considered two situations, distinguished by how the original map relates to the gradient function: in one, obstacles exist on both sides of the gradient function; in the other, obstacles exist on only one side of the gradient function.
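To make the FGSM-style attack on a policy network concrete, here is a minimal sketch in the spirit of Huang et al (2017): take one gradient step on the observation, in the direction that makes the policy's currently preferred action less likely. The TinyPolicy network, its layer sizes, and the epsilon values are illustrative assumptions, not the original authors' code.

```python
# Minimal FGSM-style perturbation of a policy's observation (a sketch,
# assuming a small feedforward policy; not the authors' original code).
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyPolicy(nn.Module):
    """Hypothetical policy network mapping an observation to action logits."""
    def __init__(self, obs_dim=8, n_actions=4):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim, 32), nn.ReLU(), nn.Linear(32, n_actions)
        )

    def forward(self, obs):
        return self.net(obs)

def fgsm_perturb(policy, obs, epsilon=0.01):
    """Return obs shifted by epsilon * sign(grad of loss w.r.t. obs).

    The currently preferred action is used as the "label", and the
    cross-entropy loss is ascended so that action becomes less likely.
    """
    obs = obs.clone().detach().requires_grad_(True)
    logits = policy(obs)
    target = logits.argmax(dim=-1)  # the action the clean policy prefers
    loss = F.cross_entropy(logits, target)
    loss.backward()
    return (obs + epsilon * obs.grad.sign()).detach()

if __name__ == "__main__":
    policy = TinyPolicy()
    obs = torch.randn(1, 8)
    adv_obs = fgsm_perturb(policy, obs, epsilon=0.05)
    print("clean action:", policy(obs).argmax(dim=-1).item())
    print("adv action:  ", policy(adv_obs).argmax(dim=-1).item())
```

Because only the sign of the gradient is used, the perturbation is bounded by epsilon in the L-infinity norm, which is what keeps such perturbations small in the settings Huang et al (2017) studied.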
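Similarly, the core idea behind the SPA/WBA line of work, finding points where a small white-box change to the learned Q-values diverts the greedy pathfinding policy, can be sketched for a tabular grid world. The grid size, the perturbation size delta, and the helper names (greedy_path, vulnerable_points) are hypothetical illustrations under simplifying assumptions, not the published method.

```python
# A sketch of probing a tabular Q-learning pathfinder for "vulnerable
# points": states where a small Q-table variation changes the greedy path.
import numpy as np

def greedy_path(Q, start, goal, moves, shape, max_steps=50):
    """Follow the greedy policy encoded in the Q-table from `start`."""
    path, s = [start], start
    for _ in range(max_steps):
        if s == goal:
            break
        a = int(np.argmax(Q[s]))
        r = min(max(s[0] + moves[a][0], 0), shape[0] - 1)
        c = min(max(s[1] + moves[a][1], 0), shape[1] - 1)
        s = (r, c)
        path.append(s)
    return path

def vulnerable_points(Q, start, goal, moves, shape, delta=0.1):
    """Return (state, action) pairs where a delta-sized Q-value change
    alters the greedy path (white-box Q-table variation)."""
    baseline = greedy_path(Q, start, goal, moves, shape)
    found = []
    for s in np.ndindex(shape):
        for a in range(Q.shape[-1]):
            Q_adv = Q.copy()
            Q_adv[s][a] += delta  # small targeted perturbation
            if greedy_path(Q_adv, start, goal, moves, shape) != baseline:
                found.append((s, a))
                break
    return found

if __name__ == "__main__":
    shape, n_actions = (4, 4), 4
    moves = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right
    Q = np.random.default_rng(0).normal(size=shape + (n_actions,))
    print(vulnerable_points(Q, (0, 0), (3, 3), moves, shape)[:5])
```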
