Abstract

A supervisory control approach using hierarchical reinforcement learning (HRL) is developed to approximate the solution to optimal regulation problems for a control-affine, continuous-time nonlinear system with unknown drift dynamics. The development has two objectives. The first objective is to approximate the optimal control policy that minimizes the infinite-horizon cost function of each approximate dynamic programming (ADP) sub-controller. The second objective is to design a switching rule that compares the approximated optimal value functions of the ADP sub-controllers, so that switching between subsystems yields a lower cost than using any single subsystem alone. An integral concurrent learning-based parameter identifier approximates the unknown drift dynamics. Uniformly ultimately bounded regulation of the system states to a neighborhood of the origin, and convergence of the approximate control policy to a neighborhood of the optimal control policy, are proven using a Lyapunov-based stability and dwell-time analysis.
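The supervisory switching rule described in the abstract might be sketched as follows. This is a minimal illustration, not the paper's exact design: the value-function approximators, the minimum dwell time guard, and the hysteresis margin are all illustrative assumptions standing in for the paper's Lyapunov-based dwell-time conditions.

```python
import numpy as np

def switching_supervisor(x, value_fns, active, t, t_last_switch,
                         min_dwell, hysteresis=0.0):
    """Hypothetical supervisory switching rule.

    value_fns : list of callables V_i(x) approximating each ADP
                sub-controller's optimal value function (assumed given).
    active    : index of the currently active sub-controller.
    Returns the (possibly updated) active index and switch time.
    """
    # Enforce a minimum dwell time to prevent chattering between subsystems.
    if t - t_last_switch < min_dwell:
        return active, t_last_switch
    costs = [V(x) for V in value_fns]
    best = int(np.argmin(costs))
    # Switch only if the candidate subsystem yields a strictly lower
    # approximated cost, by at least the hysteresis margin.
    if best != active and costs[best] + hysteresis < costs[active]:
        return best, t
    return active, t_last_switch
```

In this sketch, a switch occurs only when both the dwell-time and the cost-improvement conditions hold, mirroring the abstract's requirement that switching yield a lower cost than remaining with one subsystem.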
