Abstract
In this chapter, we first establish error bounds for adaptive dynamic programming (ADP) algorithms that solve undiscounted infinite-horizon optimal control problems for discrete-time deterministic nonlinear dynamical systems. Based on a new error condition, we derive error bounds for approximate value iteration, approximate policy iteration, and approximate optimistic policy iteration, and show that the iterative approximate value function converges to a finite neighborhood of the optimal value function under mild conditions. We then establish an error bound on the Q-function in approximate policy iteration for optimal control of unknown discounted discrete-time nonlinear dynamical systems, and develop an iterative ADP algorithm that uses a Q-function depending on both the state and the action to solve the nonlinear optimal control problem. Function approximation structures such as neural networks are used to approximate the Q-function and the control policy. These results provide theoretical guarantees for using neural network approximation to solve optimal control problems for nonlinear dynamical systems.
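As a rough illustration of the kind of approximate Q-function iteration the chapter analyzes, the sketch below fits a quadratic feature model (a stand-in for a neural network) to Bellman targets for a simple scalar system. The dynamics f, cost U, grids, and feature map are all assumptions made for illustration, not the chapter's actual setup; the least-squares residual at each iteration plays the role of the approximation error bounded in the theory.

```python
# Minimal sketch of approximate Q-function iteration (not the chapter's
# algorithm): a least-squares fit over quadratic features stands in for
# neural-network training of the Q-function.
import numpy as np

def f(x, u):
    return 0.9 * x + 0.1 * u          # assumed scalar dynamics, for illustration

def U(x, u):
    return x**2 + u**2                # assumed quadratic stage cost

def phi(x, u):
    # Quadratic feature basis parameterizing Q(x, u) = phi(x, u) @ w.
    return np.array([x * x, x * u, u * u, x, u, 1.0])

X_GRID = np.linspace(-2.0, 2.0, 21)   # state samples for the regression fit
U_GRID = np.linspace(-2.0, 2.0, 21)   # action grid for the inner minimization

w = np.zeros(6)                       # Q_0 = 0
for i in range(30):                   # approximate Q-iteration
    rows, targets = [], []
    for x in X_GRID:
        for u in U_GRID:
            x_next = f(x, u)
            # Bellman target: U(x, u) + min_{u'} Q_i(x_next, u')
            q_next = min(phi(x_next, up) @ w for up in U_GRID)
            rows.append(phi(x, u))
            targets.append(U(x, u) + q_next)
    # Least-squares projection onto the feature space; its residual is the
    # per-iteration approximation error.
    w, *_ = np.linalg.lstsq(np.array(rows), np.array(targets), rcond=None)

def policy(x):
    # Greedy policy extracted from the converged approximate Q-function.
    return min(U_GRID, key=lambda u: phi(x, u) @ w)

print("u(1.0) =", policy(1.0))
```

Because this toy system is open-loop stable with a quadratic cost, the undiscounted Q-iteration above converges from Q_0 = 0; a discount factor would be multiplied into the q_next term for the discounted setting treated in the second part of the chapter.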