Abstract

In this chapter, generalized policy iteration (GPI) algorithms are developed to solve infinite-horizon optimal control problems for discrete-time nonlinear systems. GPI algorithms combine the ideas of the policy iteration and value iteration algorithms of adaptive dynamic programming (ADP). They permit an arbitrary positive semidefinite function to initialize the algorithm and employ two interleaved iteration procedures for policy evaluation and policy improvement, respectively. The monotonicity, convergence, admissibility, and optimality properties of the present GPI algorithms for discrete-time nonlinear systems are then analyzed. To implement the GPI algorithms, neural networks are employed to approximate the iterative value functions and to compute the iterative control laws, yielding an approximate optimal control law. Simulation examples are included to verify the effectiveness of the present algorithms.
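To make the two interleaved iterations concrete, below is a minimal sketch of the GPI structure on a discretized, finite-state stand-in for the system x_{k+1} = F(x_k, u_k) with utility U(x_k, u_k). The grid sizes, the number of evaluation sweeps N_i, and the discount factor (added here only to keep the toy problem bounded; the chapter treats the undiscounted infinite-horizon case) are all illustrative assumptions, and the chapter's neural-network function approximation is not reproduced.

```python
import numpy as np

# Illustrative finite-state sketch of generalized policy iteration (GPI).
# States/actions are discretized so the two interleaved loops are visible;
# all sizes and the discount factor are assumptions, not from the chapter.
n_states, n_actions = 50, 9
gamma = 0.95  # discounting added only to bound this toy example

rng = np.random.default_rng(0)
F = rng.integers(0, n_states, size=(n_states, n_actions))  # x_{k+1} = F(x_k, u_k)
U = rng.random((n_states, n_actions))                      # utility U(x_k, u_k)

V = np.zeros(n_states)  # arbitrary positive semidefinite initial value function

for i in range(100):  # outer GPI loop
    # Policy improvement: v_i(x) = argmin_u [ U(x, u) + gamma * V_i(F(x, u)) ]
    Q = U + gamma * V[F]
    policy = Q.argmin(axis=1)

    # Policy evaluation: N_i sweeps under the fixed iterative control law.
    # N_i = 1 recovers value iteration; N_i -> infinity recovers policy iteration.
    for _ in range(5):  # N_i, chosen arbitrarily here
        s = np.arange(n_states)
        V = U[s, policy] + gamma * V[F[s, policy]]
```

Varying the number of evaluation sweeps per outer iteration is what places GPI between the two classical ADP extremes, which is the interaction the abstract refers to.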
