Abstract

Online critic learning and solving robust control problems for complex systems usually require knowledge of the system dynamics. To achieve these goals in a data-driven manner, a new performance index related to the decreasing rate of the conventional cost is designed. The corresponding optimal control policy can be approximated online using a new actor–critic scheme with three neural networks, without relying on an initial stabilizing control or knowledge of the system dynamics. Both the learning process and the learned control policy show excellent robustness. Numerical simulations and an inverted pendulum experiment show that, compared with benchmark methods, the proposed method relaxes the dependence on an initial admissible control and exhibits better disturbance attenuation performance.
