Abstract

This article presents a novel, efficient experience-replay-based adaptive dynamic programming (ADP) scheme for the optimal control problem of a class of nonlinear dynamical systems within the Hamiltonian-driven framework. The quasi-Hamiltonian is introduced for the policy evaluation problem with an admissible policy. Based on the quasi-Hamiltonian, a novel composite critic learning mechanism is developed that combines instantaneous data with historical data. In addition, the pseudo-Hamiltonian is defined to deal with the performance optimization problem. Based on the pseudo-Hamiltonian, the conventional Hamilton-Jacobi-Bellman (HJB) equation can be represented in a filtered form, which can be implemented online. Theoretical analysis is provided for the convergence of the adaptive critic design and the stability of the closed-loop system, where parameter convergence is achieved under a weakened excitation condition. Simulation studies are conducted to verify the efficacy of the presented design scheme.
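For orientation, the following is a minimal sketch of the standard continuous-time optimal control setup that Hamiltonian-driven ADP builds on, together with a generic experience-replay critic update. The dynamics $f, g$, cost terms $Q, R$, basis $\phi$, gain $\alpha$, and stored-sample index $k$ are assumed generic notation; this is not the paper's specific quasi-Hamiltonian or pseudo-Hamiltonian construction.

$$
\begin{aligned}
&\text{System and cost:}\quad \dot{x} = f(x) + g(x)u, \qquad V(x_0) = \int_0^{\infty} \big(Q(x) + u^{\top} R u\big)\, d\tau,\\[4pt]
&\text{Hamiltonian:}\quad H(x, u, \nabla V) = \nabla V^{\top}\big(f(x) + g(x)u\big) + Q(x) + u^{\top} R u,\\[4pt]
&\text{HJB equation:}\quad 0 = \min_{u} H(x, u, \nabla V^{*}), \qquad u^{*}(x) = -\tfrac{1}{2} R^{-1} g(x)^{\top} \nabla V^{*}(x),\\[4pt]
&\text{Critic approximation:}\quad \hat{V}(x) = \hat{W}^{\top} \phi(x), \qquad e = \hat{W}^{\top} \nabla\phi(x)\big(f + g u\big) + Q(x) + u^{\top} R u,\\[4pt]
&\text{Experience-replay update (generic form):}\quad
\dot{\hat{W}} = -\alpha\, \frac{\sigma}{(1 + \sigma^{\top}\sigma)^2}\, e
\;-\; \alpha \sum_{k=1}^{N} \frac{\sigma_k}{(1 + \sigma_k^{\top}\sigma_k)^2}\, e_k,
\quad \sigma = \nabla\phi(x)\big(f + g u\big).
\end{aligned}
$$

Here the second term reuses $N$ stored regressor/residual pairs $(\sigma_k, e_k)$, which is how replay-based critics relax the persistence-of-excitation requirement to a rank condition on the recorded data; the paper's composite mechanism plays an analogous role via the quasi-Hamiltonian.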
