Abstract
This paper addressed the optimal policy selection problem of attacker and sensor in cyber-physical systems (CPSs) under denial of service (DoS) attacks. Since the sensor and the attacker have opposite goals, a two-player zero-sum game is introduced to describe the game between the sensor and the attacker, and the Nash equilibrium strategies are studied to obtain the optimal actions. In order to effectively evaluate and quantify the gains, a reinforcement learning algorithm is proposed to dynamically adjust the corresponding strategies. Furthermore, security state estimation is introduced to evaluate the impact of offensive and defensive strategies on CPSs. In the algorithm, the ε-greedy policy is improved to make optimal choices based on sufficient learning, achieving a balance of exploration and exploitation. It is worth noting that the channel reliability factor is considered in order to study CPSs with multiple reasons for packet loss. The reinforcement learning algorithm is designed in two scenarios: reliable channel (that is, the reason for packet loss is only DoS attacks) and unreliable channel (the reason for packet loss is not entirely from DoS attacks). The simulation results of the two scenarios show that the proposed reinforcement learning algorithm can quickly converge to the Nash equilibrium policies of both sides, proving the availability and effectiveness of the algorithm.
Talk to us
Join us for a 30 min session where you can share your feedback and ask us any queries you have
Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.