Abstract

In this study, a novel online adaptive dynamic programming (ADP)-based algorithm is developed for solving the optimal control problem of affine non-linear continuous-time systems with unknown internal dynamics. The present algorithm employs an observer-critic architecture to approximate the Hamilton-Jacobi-Bellman equation. Two neural networks (NNs) are used in this architecture: an NN state observer is constructed to estimate the unknown system dynamics and a critic NN is designed to derive the optimal control instead of typical action-critic dual networks employed in traditional ADP algorithms. Based on the developed architecture, the observer NN and the critic NN are tuned simultaneously. Meanwhile, unlike existing tuning laws for the critic, the newly developed critic update rule not only ensures convergence of the critic to the optimal control but also guarantees stability of the closed-loop system. No initial stabilising control is required, and by using recorded and instantaneous data simultaneously for the adaptation of the critic, the restrictive persistence of excitation condition is relaxed. In addition, Lyapunov direct method is utilised to demonstrate the uniform ultimate boundedness of the weights of the observer NN and the critic NN. Finally, an example is provided to verify the effectiveness of the present approach.

Full Text
Paper version not known

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.