Abstract

This letter presents an "on-off" learning-based scheme that expands the attack surface presented to an adversary, namely a moving target defense (MTD) framework, while optimally stabilizing an unknown system. We leverage Q-learning to learn optimal strategies with "on-off" actuation, promoting unpredictability of the learned behavior against physically plausible attacks. We provide rigorous theoretical guarantees on the stability of the equilibrium point even under switching. Finally, we develop two adversarial threat models to evaluate the learning agent's ability to generate robust policies based on a distance to uncontrollability.
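To make the idea concrete, the sketch below shows tabular Q-learning with "on-off" actuation on a scalar linear system. All dynamics, gains, and hyperparameters here are illustrative assumptions, not the paper's actual formulation: the agent chooses at each step whether actuation is off (no control) or on (a fixed feedback gain is applied), and learns from a quadratic stabilization cost.

```python
import numpy as np

# Illustrative sketch (not the paper's method): tabular Q-learning over a
# discretized scalar state, with two actions: actuation "off" vs "on".
rng = np.random.default_rng(0)
a, b = 1.1, 1.0           # assumed open-loop-unstable scalar dynamics x' = a*x + b*u
gain = 0.8                # feedback gain applied only when actuation is "on"
n_bins = 21
edges = np.linspace(-2.0, 2.0, n_bins)

def discretize(x):
    """Map a continuous state to a table index."""
    return int(np.clip(np.digitize(x, edges), 0, n_bins - 1))

Q = np.zeros((n_bins, 2))  # action 0 = off, action 1 = on
alpha, gamma, eps = 0.1, 0.95, 0.2

for episode in range(2000):
    x = rng.uniform(-1.5, 1.5)
    for _ in range(30):
        s = discretize(x)
        act = rng.integers(2) if rng.random() < eps else int(Q[s].argmax())
        u = -gain * x if act == 1 else 0.0        # "on" closes the loop
        x_next = a * x + b * u
        r = -(x_next ** 2) - 0.01 * (u ** 2)      # quadratic stabilization cost
        Q[s, act] += alpha * (r + gamma * Q[discretize(x_next)].max() - Q[s, act])
        x = float(np.clip(x_next, -2.0, 2.0))

# Greedy rollout: the learned policy should keep the state bounded near the origin.
x = 1.0
for _ in range(30):
    act = int(Q[discretize(x)].argmax())
    x = a * x + (b * (-gain * x) if act == 1 else 0.0)
print(abs(x))
```

The "off" action is what gives the defender a moving target: because the optimal policy intermittently disengages actuation near the equilibrium, an observer cannot easily predict when the control channel is active.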
