Abstract

This paper proposes a reinforcement learning (RL) algorithm for the problem of secure state estimation in cyber-physical systems (CPS) under denial-of-service (DoS) attacks. The security of a CPS inevitably declines when it faces malicious cyber attacks. To analyze the impact of cyber attacks on CPS performance, a Kalman filter, as an adaptive state estimation technology, is combined with an RL method to evaluate system security, with estimation performance adopted as the evaluation criterion. The transition of the estimation error covariance under a DoS attack is then described as a Markov decision process, so that the RL algorithm can be applied to find the optimal countermeasures. Meanwhile, the interaction between defender and attacker can be regarded as a two-player zero-sum game, in which a Nash equilibrium policy exists but must be solved for. To account for energy constraints, the action selection of both sides is restricted by cost functions. The proposed RL approach is designed from three perspectives: the defender, the attacker, and the interactive game between the two opposing sides. In addition, the Q-learning and state–action–reward–state–action (SARSA) frameworks are investigated separately to analyze the influence of different RL algorithms. The results show that both algorithms obtain the corresponding optimal policy as well as the Nash equilibrium policy of the zero-sum interactive game. A comparative analysis of the two algorithms verifies that both Q-learning and SARSA, despite their differences, can be applied effectively to secure state estimation in CPS.
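
For readers unfamiliar with the two algorithms compared here, the difference lies in the bootstrap target of the tabular update. The following is a minimal Python sketch, not the paper's implementation: the function names, the list-of-lists Q-table, and the step size alpha and discount factor gamma are illustrative assumptions. In the paper's setting the state would relate to the estimation error covariance and the reward to estimation performance and action cost; here states and actions are abstract indices.

```python
def q_learning_update(Q, s, a, r, s_next, alpha=0.1, gamma=0.9):
    """Off-policy update: bootstrap from the best action value in s_next.

    Q is a hypothetical table Q[state][action] stored as a list of lists.
    """
    target = r + gamma * max(Q[s_next])
    Q[s][a] += alpha * (target - Q[s][a])
    return Q


def sarsa_update(Q, s, a, r, s_next, a_next, alpha=0.1, gamma=0.9):
    """On-policy update: bootstrap from the action actually taken in s_next."""
    target = r + gamma * Q[s_next][a_next]
    Q[s][a] += alpha * (target - Q[s][a])
    return Q
```

The only difference is the target: Q-learning uses the maximum over next actions regardless of what the agent does next, while SARSA uses the value of the next action the behavior policy actually selects, so exploration affects the values it learns.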
