Abstract

This chapter reviews the development of adaptive dynamic programming (ADP). It begins with a background overview of reinforcement learning and dynamic programming, then presents the basic forms of ADP, and finally the iterative forms. ADP is an emerging advanced control technology developed for nonlinear dynamical systems; it is based on the idea of approximating dynamic programming solutions. Dynamic programming was introduced by Bellman in the 1950s for solving optimal control problems of nonlinear dynamical systems, but its high computational complexity (the "curse of dimensionality") has limited its application to small, simple problems. The key step in approximating dynamic programming solutions is to estimate the cost function; the optimal control signal can then be determined by minimizing this cost function (or maximizing a reward function). Because of their universal approximation capability, artificial neural networks are often used to represent the cost function. The implementation of ADP usually requires three modules: critic, model, and action, which perform the functions of evaluation, prediction, and decision, respectively.
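As a concrete reference point (using standard discrete-time notation that the abstract itself does not spell out), the cost function the critic estimates is the optimal cost-to-go of Bellman's optimality equation. For a system x_{k+1} = F(x_k, u_k) with utility U and discount factor γ, a common statement is:

```latex
J^{*}(x_k) = \min_{u_k} \bigl[\, U(x_k, u_k) + \gamma\, J^{*}\bigl(F(x_k, u_k)\bigr) \,\bigr],
\qquad
u_k^{*} = \arg\min_{u_k} \bigl[\, U(x_k, u_k) + \gamma\, J^{*}\bigl(F(x_k, u_k)\bigr) \,\bigr].
```

The sketch below illustrates, under simplifying assumptions, how the three modules interact in a heuristic-dynamic-programming style loop: the model predicts the next state, the critic evaluates the cost-to-go (here a one-weight quadratic approximator standing in for a neural network), and the action module decides by minimizing the estimated cost. The plant, stage cost, and all constants are illustrative choices, not taken from the chapter.

```python
# A minimal sketch of the three-module ADP loop (critic, model, action),
# assuming a known scalar linear plant x' = a*x + b*u and a quadratic
# stage cost U(x, u) = x^2 + u^2.  Everything here is illustrative.
import numpy as np

a, b, gamma = 0.9, 1.0, 0.95          # plant parameters and discount factor
w = 0.0                               # critic weight: J(x) ~= w * x^2
lr = 0.1                              # critic learning rate
u_grid = np.linspace(-2.0, 2.0, 201)  # candidate controls for the action module

def model(x, u):                      # model module: predicts the next state
    return a * x + b * u

def critic(x):                        # critic module: evaluates cost-to-go
    return w * x**2

def action(x):                        # action module: greedy decision over u
    q = x**2 + u_grid**2 + gamma * critic(model(x, u_grid))
    return u_grid[np.argmin(q)]

for episode in range(200):            # train the critic by temporal-difference updates
    x = np.random.uniform(-1.0, 1.0)
    for _ in range(20):
        u = action(x)
        x_next = model(x, u)
        target = x**2 + u**2 + gamma * critic(x_next)   # Bellman target
        grad = x**2                                     # dJ/dw for J = w * x^2
        w += lr * (target - critic(x)) * grad / (1.0 + grad**2)  # normalized step
        x = x_next

print(f"learned critic weight w = {w:.3f}")
```

The division of labor mirrors the abstract's evaluation/prediction/decision description: only the critic is trained here, while the action module stays greedy with respect to the current critic estimate.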
