State of the Art of Adaptive Dynamic Programming and Reinforcement Learning

Derong Liu,Shan Xue,Mingming Ha

doi:10.26599/air.2022.9150007

Abstract

This article introduces the state-of-the-art development of adaptive dynamic programming and reinforcement learning (ADPRL). First, algorithms in reinforcement learning (RL) are introduced and their roots in dynamic programming are illustrated. Adaptive dynamic programming (ADP) is then introduced following a brief discussion of dynamic programming. Researchers in ADP and RL have enjoyed the fast developments of the past decade from algorithms, to convergence and optimality analyses, and to stability results. Several key steps in the recent theoretical developments of ADPRL are mentioned with some future perspectives. In particular, convergence and optimality results of value iteration and policy iteration are reviewed, followed by an introduction to the most recent results on stability analysis of value iteration algorithms.

Full Text