Abstract

In this paper, we propose a data-driven adaptive dynamic programming (ADP) approach to solving the Hamilton-Jacobi (HJ) equations of the two-player nonzero-sum (NZS) game with completely unknown dynamics. First, a model-based policy iteration (PI) algorithm is given, which requires knowledge of the system dynamics. To relax this requirement, a data-driven ADP method is then proposed that solves the unknown nonlinear NZS game using only online data. Neural network (NN) approximators are constructed to approximate the solution of the HJ equations. The online data are collected under two initial admissible control policies. The NN weights are then updated by the least-squares method, reusing the collected data repeatedly, which makes this an off-policy learning scheme. Finally, a simulation example is provided to demonstrate the effectiveness of the proposed control scheme.
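The least-squares weight update mentioned above can be illustrated with a deliberately reduced sketch. It is not the paper's algorithm: it assumes a single player, a scalar linear system, a fixed policy, and a one-term quadratic approximator, and every name and numerical value in it is hypothetical. It only shows how approximator weights can be fitted from a batch of collected state samples without using the model in the fitting step; the model appears solely to generate the data and to check the result.

```python
import numpy as np

# Hypothetical scalar example (not from the paper):
#   dx/dt = a*x + b*u, fixed policy u = -k*x, cost = integral of q*x^2 + r*u^2.
a, b, k = -1.0, 1.0, 0.5
q, r = 1.0, 1.0
T = 0.1                       # length of each data-collection interval

a_cl = a - b * k              # closed-loop pole (used only to simulate data)
stage = q + r * k**2          # running cost is stage * x^2 under u = -k*x

# "Collected data": state at the start and end of each interval, and the
# cost accumulated over the interval (closed form for this linear system).
rng = np.random.default_rng(0)
x0 = rng.uniform(-2.0, 2.0, size=50)
x1 = x0 * np.exp(a_cl * T)
cost = stage * x0**2 * (np.exp(2 * a_cl * T) - 1.0) / (2 * a_cl)

# Approximator V(x) = w * x^2. Along a trajectory,
#   V(x(t)) - V(x(t+T)) = cost accumulated over [t, t+T],
# which is linear in w, so w follows from one batch least-squares solve.
Phi = (x0**2 - x1**2).reshape(-1, 1)
w, *_ = np.linalg.lstsq(Phi, cost, rcond=None)

# Model-based reference value for comparison: 2*a_cl*p + stage = 0.
p_exact = -stage / (2 * a_cl)
print(w[0], p_exact)
```

In this toy setting the fitted weight matches the model-based value because the one-term basis represents the true value function exactly. With a general nonlinear system and an NN basis, the fit would only be approximate.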

