Abstract

This chapter develops a data-driven implementation of model-based reinforcement learning to solve approximate optimal control problems online under a persistence of excitation-like rank condition. The development is based on the observation that, given a model of the system, reinforcement learning can be implemented by evaluating the Bellman error at any number of desired points in the state space. In this chapter, a parametric system model is considered, and a data-driven parameter identifier is developed to compensate for uncertainty in the parameters. Uniformly ultimately bounded regulation of the system states to a neighborhood of the origin, and convergence of the developed policy to a neighborhood of the optimal policy, are established using a Lyapunov-based analysis. Simulation results indicate that the developed controller can be implemented to achieve fast online learning without the addition of ad hoc probing signals as in Chap. 3. The developed model-based reinforcement learning method is extended to solve trajectory tracking problems for uncertain nonlinear systems, and to generate approximate feedback-Nash equilibrium solutions to N-player nonzero-sum differential games.
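To make the Bellman error extrapolation idea concrete, the following is a minimal sketch of the quantity being evaluated, assuming control-affine dynamics $\dot{x} = f(x) + g(x)u$, an instantaneous cost $r(x,u) = Q(x) + u^{\top}Ru$, a value function approximation $\hat{V}(x) = \hat{W}_c^{\top}\sigma(x)$, and an identified drift model $\hat{f}(x,\hat{\theta})$; the symbols $\sigma$, $\hat{W}_c$, $\hat{W}_a$, $\hat{\theta}$, and the extrapolation points $x_i$ are illustrative notation and are not taken from the abstract.

\[
\hat{u}\bigl(x,\hat{W}_a\bigr) \;=\; -\tfrac{1}{2}\,R^{-1} g^{\top}(x)\,\nabla\sigma^{\top}(x)\,\hat{W}_a,
\]
\[
\delta\bigl(x,\hat{W}_c,\hat{W}_a,\hat{\theta}\bigr)
  \;=\; \hat{W}_c^{\top}\nabla\sigma(x)\Bigl(\hat{f}\bigl(x,\hat{\theta}\bigr) + g(x)\,\hat{u}\bigl(x,\hat{W}_a\bigr)\Bigr)
  \;+\; Q(x) \;+\; \hat{u}^{\top}\bigl(x,\hat{W}_a\bigr)\,R\,\hat{u}\bigl(x,\hat{W}_a\bigr),
\]
\[
\delta_i \;=\; \delta\bigl(x_i,\hat{W}_c,\hat{W}_a,\hat{\theta}\bigr), \qquad i = 1,\dots,N.
\]

Because the extrapolated errors $\delta_i$ can be computed at user-selected points $x_i$ using the identified model, the rank condition needed for learning can be satisfied through the choice of the points $\{x_i\}$ rather than by injecting probing signals into the control input, which is the mechanism behind the fast online learning reported in the simulations.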
