Model-based reinforcement learning for approximate optimal regulation

Rushikesh Kamalapurkar, Patrick Walters, Warren E. Dixon

doi:10.1016/j.automatica.2015.10.039

Open Access

Journal: Automatica
Publication Date: Dec 7, 2015
Citations: 183
License type: publisher-specific-oa

Affiliation: University of Florida

Abstract

Reinforcement learning (RL)-based online approximate optimal control methods applied to deterministic systems typically require a restrictive persistence of excitation (PE) condition for convergence. This paper develops a concurrent learning (CL)-based implementation of model-based RL to solve approximate optimal regulation problems online under a PE-like rank condition. The development is based on the observation that, given a model of the system, RL can be implemented by evaluating the Bellman error at any number of desired points in the state space. In this result, a parametric system model is considered, and a CL-based parameter identifier is developed to compensate for uncertainty in the parameters. Uniformly ultimately bounded regulation of the system states to a neighborhood of the origin, and convergence of the developed policy to a neighborhood of the optimal policy are established using a Lyapunov-based analysis, and simulation results are presented to demonstrate the performance of the developed controller.
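The observation that a model lets the Bellman error be evaluated at arbitrary points of the state space can be illustrated with a minimal sketch. It is not the paper's implementation: the scalar system x_dot = a*x + b*u, the quadratic cost q*x^2 + r*u^2, the single basis function sigma(x) = x^2, and all names below are assumptions chosen for illustration, and the model is taken as exactly known, whereas the paper handles parametric uncertainty with a CL-based identifier. For this toy problem the optimal weight is sqrt(2) - 1, so the Bellman error should vanish at every sampled point for that weight and not for a wrong one.

```python
import numpy as np

# Hypothetical scalar example (not from the paper):
# dynamics x_dot = A*x + B*u, running cost Q*x^2 + R*u^2.
A, B, Q, R = -1.0, 1.0, 1.0, 1.0

def f(x):
    return A * x  # drift dynamics

def g(x):
    return B  # control effectiveness

def grad_sigma(x):
    # Gradient of the assumed basis sigma(x) = [x^2].
    return np.array([2.0 * x])

def policy(x, w):
    # Policy induced by the value estimate V(x) = w^T sigma(x):
    # u = -1/2 * R^{-1} * g(x) * dV/dx.
    return -0.5 / R * g(x) * (grad_sigma(x) @ w)

def bellman_error(x, w):
    # delta(x) = dV/dx * (f(x) + g(x) u) + cost(x, u).
    u = policy(x, w)
    dV = grad_sigma(x) @ w
    return dV * (f(x) + g(x) * u) + Q * x**2 + R * u**2

# With a model, the error can be evaluated at any chosen points,
# not only along the trajectory the system actually follows.
points = np.linspace(-2.0, 2.0, 9)
w_star = np.array([np.sqrt(2.0) - 1.0])  # optimal weight for this toy problem
w_bad = np.array([1.0])                  # deliberately wrong weight

errors_star = np.array([bellman_error(x, w_star) for x in points])
errors_bad = np.array([bellman_error(x, w_bad) for x in points])

print("max |delta| at optimal weight:", np.max(np.abs(errors_star)))
print("max |delta| at wrong weight:  ", np.max(np.abs(errors_bad)))
```

In a learning scheme of this kind, the errors collected at the sampled points would drive the weight update; the sketch only shows the evaluation step, which is what replaces the need to excite the system along its own trajectory.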

Full Text