Abstract

**Read the paper:** https://ifaamas.org/Proceedings/aamas2022/pdfs/p1409.pdf

As reinforcement learning (RL) systems are deployed in various safety-critical applications, it is imperative to understand how vulnerable they are to adversarial attacks. Among these, an environment-poisoning attack (EPA) is considered particularly insidious, since environment hyper-parameters are significant factors in determining an RL policy, yet are prone to access by third parties. The success of existing EPAs relies on comprehensive prior knowledge of the attacked RL system, including the RL agent's learning mechanism and/or its environment model. Unfortunately, such an assumption of prior knowledge makes the attack unrealistic, so it poses little threat to real-world RL systems. In this paper, we propose a Double-Black-Box EPA framework that assumes only the attacker's ability to alter environment hyper-parameters. Since environment alteration comes at a cost, we seek the minimal poisoning of an unknown environment that forces a black-box RL agent to learn an attacker-designed policy. To this end, we incorporate an inference module into our framework to capture internal information about the unknown RL system and, accordingly, learn an adaptive attack strategy based on an approximation of our attack objective. We empirically demonstrate the threat posed by our attack to both tabular-RL and deep-RL algorithms, in both discrete and continuous environments.

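The abstract stops at the framework level, so the following toy sketch is only an illustration of the attack loop it describes, under assumptions not taken from the paper: a five-state chain MDP, a Q-learning victim treated as a black box, a single poisoned hyper-parameter (`slip_prob`, the probability that a chosen action is flipped), and a plain grid search standing in for the paper's learned adaptive strategy. Names such as `run_victim` and `attack_loss` are hypothetical, not the authors' API.

```python
"""Hypothetical sketch of a double-black-box environment-poisoning loop: the
attacker never inspects the victim's learning algorithm or the environment
model, only perturbs one environment hyper-parameter and observes the
victim's resulting behaviour."""
import random

N_STATES, N_ACTIONS = 5, 2   # tiny chain MDP: action 1 moves right, action 0 moves left
TARGET_ACTION = 0            # attacker-designed policy: prefer "left" in every state

def run_victim(slip_prob, episodes=300, seed=0):
    """Black-box victim: a Q-learning agent trained in the poisoned environment.
    The attacker only sees the greedy policy it returns, not the Q-table."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        s = 0
        for _ in range(20):
            # epsilon-greedy action selection by the victim
            if rng.random() < 0.1:
                a = rng.randrange(N_ACTIONS)
            else:
                a = max(range(N_ACTIONS), key=lambda x: q[s][x])
            # poisoned dynamics: with probability slip_prob the chosen move is flipped
            move = 1 - a if rng.random() < slip_prob else a
            s2 = min(N_STATES - 1, s + 1) if move == 1 else max(0, s - 1)
            r = 1.0 if s2 == N_STATES - 1 else 0.0   # reward for reaching the right end
            q[s][a] += 0.1 * (r + 0.9 * max(q[s2]) - q[s][a])
            s = s2
    return [max(range(N_ACTIONS), key=lambda x: q[s][x]) for s in range(N_STATES)]

def attack_loss(slip_prob, clean_value=0.0, cost_weight=0.5):
    """Attacker's objective: mismatch with the target policy plus poisoning cost."""
    policy = run_victim(slip_prob)
    mismatch = sum(a != TARGET_ACTION for a in policy) / N_STATES
    cost = abs(slip_prob - clean_value)   # deviation from the clean hyper-parameter
    return mismatch + cost_weight * cost

# Derivative-free search over the single poisoned hyper-parameter; a stand-in
# for the adaptive attack strategy learned in the paper.
best = min((round(p / 20, 2) for p in range(21)), key=attack_loss)
print("cheapest slip probability found:", best, "loss:", attack_loss(best))
```

Raising the slip probability makes the "left" action the more reliable way to move right, so the black-box learner ends up preferring the attacker's target action; the cost term biases the search toward the smallest perturbation that achieves this, mirroring the minimal-poisoning objective stated in the abstract.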