Abstract
In this paper, we propose a data-driven adaptive dynamic programming (ADP) approach to solve the Hamilton-Jacobi (HJ) equations for two-player nonzero-sum (NZS) games with completely unknown dynamics. First, a model-based policy iteration (PI) algorithm is given, which requires knowledge of the system dynamics. To relax this requirement, a data-driven ADP method is proposed that solves the unknown nonlinear NZS game using only online data. Neural network (NN) approximators are constructed to approximate the solution of the HJ equations. The online data are collected under two initial admissible control policies. The NN weights are then updated by the least-squares method, reusing the collected online data repeatedly, which makes this an off-policy learning scheme. Finally, a simulation example is provided to demonstrate the effectiveness of the proposed control scheme.
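To illustrate the kind of batch least-squares weight update described above, the following is a minimal sketch and not the paper's algorithm. It assumes a single fixed policy on a hypothetical scalar closed-loop system dx/dt = -x with running cost x^2, a two-term polynomial basis, and an integral Bellman relation V(x(t)) - V(x(t+T)) = integral of the cost over [t, t+T]. All names (`basis`, `collect_data`, `least_squares_weights`) are illustrative. The regression itself uses only the sampled states and accumulated costs, not the dynamics.

```python
import numpy as np

def basis(x):
    # Hypothetical polynomial basis phi(x) = [x^2, x^4] for V(x) ~ w^T phi(x).
    return np.stack([x**2, x**4], axis=-1)

def collect_data(x0, T):
    # Stand-in for measured online data. For the assumed closed loop
    # dx/dt = -x with running cost x^2, the state after T seconds and the
    # cost accumulated over [0, T] are available in closed form.
    x1 = x0 * np.exp(-T)
    rho = 0.5 * x0**2 * (1.0 - np.exp(-2.0 * T))
    return x1, rho

def least_squares_weights(x0, x1, rho):
    # Each sample gives one linear equation in w:
    #   w^T (phi(x(t)) - phi(x(t+T))) = accumulated cost.
    # Stack all samples and solve in the least-squares sense.
    A = basis(x0) - basis(x1)
    w, *_ = np.linalg.lstsq(A, rho, rcond=None)
    return w

rng = np.random.default_rng(0)
x0 = rng.uniform(-2.0, 2.0, size=50)   # sampled initial states
x1, rho = collect_data(x0, T=0.1)
w = least_squares_weights(x0, x1, rho)
print(w)
```

For this toy problem the exact value function is V(x) = 0.5 x^2, so the fitted weights should come out close to [0.5, 0]. The two-player game in the paper would involve one such approximator per player and coupled equations, which this sketch does not attempt to reproduce.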