Abstract

This paper considers the problem of identifying the cognitive capabilities of boundedly rational agents in a cyber–physical system operating in an uncertain and/or adversarial environment. To categorize the adversaries, we introduce an iterative method of optimal responses that determines the policy of an agent with level-k intelligence. We then formulate a model-free learning algorithm that trains the different intelligence levels without any knowledge of the physics of the system, while quantifying optimality. Through sequential interaction with the adversaries, we learn the true distribution of their levels. Rigorous mathematical proofs establish stability of the equilibrium point of the closed-loop system and convergence to the Nash equilibrium as the level of intelligence tends to infinity. Finally, simulation results demonstrate the efficacy of the proposed approach.
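
To make the level-k idea concrete, the following is a minimal sketch on a toy static game. It is not the paper's algorithm, which is model-free and acts on a dynamical system. Everything here is an illustrative assumption: the linear best-response maps u = a*d + b (defender) and d = c*u + e (adversary), the level-0 anchor of zero action, the function names, and the nearest-action rule used to estimate the distribution of levels. When |a*c| < 1, the level-k actions approach the Nash equilibrium as k grows.

```python
from collections import Counter


def level_k_policies(a, b, c, e, max_level):
    """Return [(u_k, d_k)] for k = 0..max_level in a toy linear game.

    Assumed best responses (hypothetical, for illustration only):
      defender:  u = a * d + b
      adversary: d = c * u + e
    A level-k player best-responds to a level-(k-1) opponent.
    """
    u, d = 0.0, 0.0  # level-0 anchor: both players take zero action (assumption)
    levels = [(u, d)]
    for _ in range(max_level):
        # Both updates use the previous level's actions of the opponent.
        u, d = a * d + b, c * u + e
        levels.append((u, d))
    return levels


def nash_equilibrium(a, b, c, e):
    """Fixed point of the two best-response maps (requires a * c != 1)."""
    u = (a * e + b) / (1.0 - a * c)
    return u, c * u + e


def estimate_level_distribution(observed_d, levels):
    """Empirical level frequencies: assign each observed adversary action
    to the level whose action is closest (a simple stand-in estimator)."""
    counts = Counter(
        min(range(len(levels)), key=lambda k: abs(levels[k][1] - d_obs))
        for d_obs in observed_d
    )
    n = len(observed_d)
    return [counts.get(k, 0) / n for k in range(len(levels))]


if __name__ == "__main__":
    a, b, c, e = 0.5, 1.0, -0.4, 0.2  # |a * c| = 0.2 < 1, so the iteration contracts
    deep = level_k_policies(a, b, c, e, max_level=40)
    print("level-40 actions:", deep[-1])
    print("Nash equilibrium:", nash_equilibrium(a, b, c, e))

    shallow = level_k_policies(a, b, c, e, max_level=3)
    observed = [shallow[1][1], shallow[1][1], shallow[2][1], shallow[3][1]]
    print("estimated level distribution:", estimate_level_distribution(observed, shallow))
```

In this sketch the gap between the level-k actions and the equilibrium shrinks geometrically with k, which mirrors the convergence claim in the abstract in the simplest possible setting; the frequency estimate is only a placeholder for the sequential learning of the level distribution described above.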
