Abstract

In this chapter, we establish error bounds for adaptive dynamic programming (ADP) algorithms that solve undiscounted infinite-horizon optimal control problems for discrete-time deterministic nonlinear dynamical systems. Based on a new error condition, we derive error bounds for the approximate value iteration, approximate policy iteration, and approximate optimistic policy iteration algorithms, and show that the iterative approximate value function converges to a finite neighborhood of the optimal value function under mild conditions. In addition, we establish an error bound for the Q-function in approximate policy iteration for the optimal control of unknown discounted discrete-time nonlinear dynamical systems. We develop an iterative ADP algorithm that uses a Q-function depending on both state and action to solve nonlinear optimal control problems, with function approximation structures such as neural networks used to approximate the Q-function and the control policy. These results provide theoretical guarantees for using neural network approximation to solve optimal control problems for nonlinear dynamical systems.
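To make the setting concrete, the sketch below illustrates approximate value iteration of the kind the error bounds cover: the Bellman backup V_{k+1}(x) = min_u [U(x, u) + V_k(f(x, u))] is computed on sampled states and refit with a parametric approximator, so each iteration incurs exactly the kind of approximation error the bounds account for. The dynamics f, stage cost U, sampling grids, and the least-squares polynomial approximator (standing in for a neural network) are illustrative assumptions, not the chapter's specific construction.

```python
import numpy as np

def f(x, u):
    # Hypothetical scalar nonlinear dynamics x_{k+1} = f(x_k, u_k).
    return 0.8 * np.sin(x) + u

def U(x, u):
    # Quadratic stage cost, positive definite in (x, u).
    return x**2 + u**2

def features(x):
    # Even polynomial features standing in for a neural-network
    # value-function approximator.
    return np.stack([x**2, x**4, np.abs(x)**3], axis=-1)

xs = np.linspace(-2.0, 2.0, 201)   # sampled states
us = np.linspace(-1.5, 1.5, 61)    # discretized actions
w = np.zeros(3)                    # approximator weights; V_0 = 0

for k in range(50):
    # Bellman backup using the current approximate value function.
    X, Uu = np.meshgrid(xs, us, indexing="ij")
    targets = (U(X, Uu) + features(f(X, Uu)) @ w).min(axis=1)
    # Refit V_{k+1} by least squares; the residual of this fit is the
    # per-iteration approximation error that the error bounds quantify.
    w, *_ = np.linalg.lstsq(features(xs), targets, rcond=None)

def policy(x):
    # Greedy policy induced by the final approximate value function.
    return us[np.argmin(U(x, us) + features(f(x, us)) @ w)]

print("V_hat(1.0) =", features(np.array(1.0)) @ w, " u*(1.0) =", policy(1.0))
```

The Q-function variant discussed for unknown systems replaces V_k(f(x, u)) with a learned Q_k(x, u), so that neither the backup nor the greedy minimization requires an explicit model of f.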
