Abstract

This study establishes an approximate optimal critic learning algorithm, based on single-network adaptive dynamic programming, for solving continuous-time two-player zero-sum games without an initial stabilising control policy. Single-network means that only one critic neural network is used; it derives the saddle-point equilibrium of the zero-sum differential game by approximately learning the value function. First, the authors formulate the two-player zero-sum game problem mathematically and analyse how the problem for linear systems relates to the problem for non-linear systems. Then, the adaptive learning scheme is implemented as a critic structure that derives the control and disturbance policies by learning the optimal value, and a novel weight tuning law involving a stable operator is proposed to ensure convergence and stability. Moreover, the closed-loop system is rigorously proved to be uniformly ultimately bounded using Lyapunov theory. Finally, simulation results for a complex linear system and a non-linear system confirm the effectiveness of the improved approximate optimal control technique in solving the game equations.
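
To make the saddle-point idea concrete, the sketch below works through the simplest linear case only. It is not the authors' critic learning algorithm or weight tuning law. It assumes a hypothetical scalar plant dx/dt = a*x + b*u + d*w with cost integrand q*x^2 + r*u^2 - gamma^2*w^2 and a quadratic value function V(x) = p*x^2, for which the game equation reduces to a scalar Riccati equation with a closed-form solution. All names and numerical values (`a`, `b`, `d`, `q`, `r`, `gamma`, `solve_scalar_gare`, `hamiltonian`) are illustrative assumptions, not taken from the paper.

```python
import math


def solve_scalar_gare(a, b, d, q, r, gamma):
    """Positive root p of 2*a*p + q - s*p**2 = 0, with s = b^2/r - d^2/gamma^2.

    This is the scalar game Riccati equation obtained by substituting
    V(x) = p*x^2 and the saddle-point policies into the Hamiltonian.
    """
    s = b**2 / r - d**2 / gamma**2
    if s <= 0:
        # Assumption of this sketch: the control outweighs the disturbance.
        raise ValueError("need b^2/r > d^2/gamma^2 for this closed form")
    return (a + math.sqrt(a**2 + s * q)) / s


def hamiltonian(x, u, w, p, a, b, d, q, r, gamma):
    """H = q*x^2 + r*u^2 - gamma^2*w^2 + V'(x)*(a*x + b*u + d*w), V = p*x^2."""
    return (q * x**2 + r * u**2 - gamma**2 * w**2
            + 2.0 * p * x * (a * x + b * u + d * w))


# Hypothetical open-loop unstable plant (a > 0): no stabilising policy is given.
params = (1.0, 1.0, 0.5, 1.0, 1.0, 2.0)  # a, b, d, q, r, gamma
a, b, d, q, r, gamma = params

p = solve_scalar_gare(*params)

# Saddle-point policies from the stationarity conditions dH/du = 0, dH/dw = 0.
k_u = -b * p / r          # control gain:      u* = k_u * x (minimiser)
k_w = d * p / gamma**2    # disturbance gain:  w* = k_w * x (maximiser)

x = 1.3                   # arbitrary test state
u_star, w_star = k_u * x, k_w * x
h_star = hamiltonian(x, u_star, w_star, p, *params)

# Closed-loop pole under both saddle-point policies.
closed_loop = a + b * k_u + d * k_w

print("p =", p)
print("Hamiltonian at saddle point =", h_star)
print("closed-loop pole =", closed_loop)
```

At the saddle point the Hamiltonian vanishes, any deviation in `u` raises it, and any deviation in `w` lowers it. A critic network with the single basis function x^2 would be learning the weight `p` in this special case. For matrix-valued linear systems the same structure leads to a game algebraic Riccati equation, which generally needs a numerical solver.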
