Abstract

This letter presents an "on-off" learning-based scheme that expands the attack surface presented to an adversary, namely a moving target defense (MTD) framework, while optimally stabilizing an unknown system. We leverage Q-learning to learn optimal strategies with "on-off" actuation, promoting unpredictability of the learned behavior against physically plausible attacks. We provide rigorous theoretical guarantees on the stability of the equilibrium point even under switching. Finally, we develop two adversarial threat models to evaluate the learning agent's ability to generate robust policies based on a distance to uncontrollability.
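To make the idea concrete, the sketch below shows tabular Q-learning with "on-off" actuation on a scalar linear system. All dynamics, gains, and hyperparameters here are illustrative assumptions, not the paper's actual formulation: the agent chooses at each step whether actuation is off (no control) or on (a fixed feedback gain is applied), and learns from a quadratic stabilization cost.

```python
import numpy as np

# Illustrative sketch (not the paper's method): tabular Q-learning over a
# discretized scalar state, with two actions: actuation "off" vs "on".
rng = np.random.default_rng(0)
a, b = 1.1, 1.0           # assumed open-loop-unstable scalar dynamics x' = a*x + b*u
gain = 0.8                # feedback gain applied only when actuation is "on"
n_bins = 21
edges = np.linspace(-2.0, 2.0, n_bins)

def discretize(x):
    """Map a continuous state to a table index."""
    return int(np.clip(np.digitize(x, edges), 0, n_bins - 1))

Q = np.zeros((n_bins, 2))  # action 0 = off, action 1 = on
alpha, gamma, eps = 0.1, 0.95, 0.2

for episode in range(2000):
    x = rng.uniform(-1.5, 1.5)
    for _ in range(30):
        s = discretize(x)
        act = rng.integers(2) if rng.random() < eps else int(Q[s].argmax())
        u = -gain * x if act == 1 else 0.0        # "on" closes the loop
        x_next = a * x + b * u
        r = -(x_next ** 2) - 0.01 * (u ** 2)      # quadratic stabilization cost
        Q[s, act] += alpha * (r + gamma * Q[discretize(x_next)].max() - Q[s, act])
        x = float(np.clip(x_next, -2.0, 2.0))

# Greedy rollout: the learned policy should keep the state bounded near the origin.
x = 1.0
for _ in range(30):
    act = int(Q[discretize(x)].argmax())
    x = a * x + (b * (-gain * x) if act == 1 else 0.0)
print(abs(x))
```

The "off" action is what gives the defender a moving target: because the optimal policy intermittently disengages actuation near the equilibrium, an observer cannot easily predict when the control channel is active.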
