Abstract

This article presents a policy-iteration-based algorithm for the optimal control of unknown continuous-time nonlinear systems subject to bounded inputs, using adaptive dynamic programming (ADP). The proposed algorithm uses three neural networks (NNs), called the critic network, the actor network, and the quasi-model network, to approximate the cost function, the control law, and a function formed from the partial derivatives of the value function with respect to the states and the unknown input gain dynamics, respectively. At each iteration, the parameters of the critic and quasi-model networks are tuned simultaneously by the least-squares method, which eliminates the need to learn the system model separately in advance. The control law is then improved by satisfying the necessary optimality condition. The optimality and convergence properties of the proposed algorithm are established. Finally, simulation results demonstrate the effectiveness of the proposed algorithm.
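
The alternation between policy evaluation and policy improvement that underlies policy iteration can be illustrated on a much simpler problem than the one treated in the article. The sketch below is not the proposed algorithm: it assumes a known scalar linear system dx/dt = a*x + b*u with unbounded input and quadratic cost, so the evaluation step has a closed form and needs no neural networks or least-squares fitting. The function name `policy_iteration` and all parameter names are hypothetical and chosen only for this illustration.

```python
def policy_iteration(a, b, q, r, k0, iters=20):
    """Illustrative policy iteration for dx/dt = a*x + b*u with
    cost integral of (q*x^2 + r*u^2) dt and linear policy u = -k*x.

    Assumption: the model (a, b) is known and the input is unbounded,
    unlike the model-free, bounded-input setting of the article.
    """
    k = k0
    history = []
    for _ in range(iters):
        closed_loop = a - b * k
        if closed_loop >= 0:
            raise ValueError("policy u = -k*x must be stabilizing")
        # Policy evaluation: V(x) = p*x^2 must satisfy
        # 2*p*(a - b*k) + q + r*k^2 = 0 along closed-loop trajectories.
        p = (q + r * k**2) / (-2.0 * closed_loop)
        # Policy improvement: minimizing the Hamiltonian over u
        # gives u = -(b*p/r)*x, i.e. the new gain k = b*p/r.
        k = b * p / r
        history.append(p)
    return p, k, history


if __name__ == "__main__":
    p, k, history = policy_iteration(a=-1.0, b=1.0, q=1.0, r=1.0, k0=0.0)
    print(p, k)
```

With a = -1 and b = q = r = 1, the iterates approach the positive root of the scalar Riccati equation 2*a*p - (b^2/r)*p^2 + q = 0, which is sqrt(2) - 1. The initial gain must be stabilizing, which is why the sketch starts from k0 = 0 on an already stable system.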
