Abstract

Approximate dynamic programming (ADP) is a powerful class of algorithmic strategies for solving stochastic optimization problems where optimal decisions can be characterized using Bellman's optimality equation, but the characteristics of the problem make solving Bellman's equation computationally intractable. This brief article provides an introduction to the basic concepts of ADP, while building bridges between the different communities that have contributed to this field. We cover basic approximate value iteration (temporal difference learning), policy approximation, and a brief introduction to strategies for approximating value functions. We cover Q-learning and the use of the post-decision state for solving problems with vector-valued decisions. The approximate linear programming method is introduced, along with a discussion of step size selection issues. The presentation closes with a discussion of some practical issues that arise in the implementation of ADP techniques.
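For readers unfamiliar with the terminology, Bellman's optimality equation for a discounted infinite-horizon problem can be written in a standard form (the notation below is illustrative, not taken from the article):

\[
V(s) \;=\; \max_{a \in \mathcal{A}} \Big\{ C(s,a) \;+\; \gamma \sum_{s' \in \mathcal{S}} \mathbb{P}(s' \mid s, a)\, V(s') \Big\},
\]

where \(V(s)\) is the value of being in state \(s\), \(C(s,a)\) is the one-period contribution of taking action \(a\), and \(\gamma\) is a discount factor. Evaluating the expectation over next states \(s'\) and the maximization over actions exactly is what becomes computationally intractable for large state, action, and outcome spaces, which motivates the approximation strategies surveyed in the article.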
