Introduction

Decisions must often be made sequentially over time, and earlier decisions may affect the feasibility and performance of later ones. In such environments, myopic decisions that optimize only the immediate impact are usually suboptimal for the overall process; to find optimal strategies, one must consider current and future decisions simultaneously. These multi-stage decision problems are the typical setting in which one employs dynamic programming (DP). Dynamic programming is a term used both for the modeling methodology and for the solution approaches developed to solve sequential decision problems. In some cases the sequential nature of the decision process is obvious and natural; in other cases one reinterprets the original problem as a sequential decision problem. We will consider examples of both types below.

Dynamic programming models and methods are based on Bellman's Principle of Optimality: for overall optimality in a sequential decision process, all the remaining decisions after reaching a particular state must be optimal with respect to that state. In other words, if a strategy for a sequential decision problem makes a suboptimal decision in any one of the intermediate stages, it cannot be optimal for the overall problem. This principle allows one to formulate recursive relationships between the optimal strategies of successive decision stages, and these relationships form the backbone of DP algorithms.
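To make the recursive relationship concrete, the sketch below implements backward recursion for a generic finite-horizon problem. This is an illustrative assumption on our part, not code from the article; the names backward_recursion, stage_cost, transition, and terminal_cost are hypothetical placeholders for the problem data.

```python
# A minimal sketch (not from the article) of Bellman's backward recursion for a
# finite-horizon problem.  The names backward_recursion, stage_cost, transition,
# and terminal_cost are illustrative assumptions, not notation from the text.

def backward_recursion(num_stages, states, actions, transition, stage_cost, terminal_cost):
    """Compute the optimal cost-to-go V_t(s) and a policy from the recursion
    V_t(s) = min_a [ stage_cost(t, s, a) + V_{t+1}(transition(t, s, a)) ]."""
    V = {num_stages: {s: terminal_cost(s) for s in states}}
    policy = {}
    for t in range(num_stages - 1, -1, -1):           # work backwards from the last stage
        V[t], policy[t] = {}, {}
        for s in states:
            best_value, best_action = float("inf"), None
            for a in actions(t, s):                   # enumerate feasible decisions in state s
                value = stage_cost(t, s, a) + V[t + 1][transition(t, s, a)]
                if value < best_value:
                    best_value, best_action = value, a
            V[t][s], policy[t][s] = best_value, best_action
    return V, policy

# Toy usage: three stages, states 0..2, moves of -1/0/+1 costing |move|,
# and a terminal penalty of 5 for ending anywhere other than state 0.
states = [0, 1, 2]
actions = lambda t, s: [a for a in (-1, 0, 1) if 0 <= s + a <= 2]
transition = lambda t, s, a: s + a
stage_cost = lambda t, s, a: abs(a)
terminal_cost = lambda s: 0 if s == 0 else 5

V, policy = backward_recursion(3, states, actions, transition, stage_cost, terminal_cost)
print(V[0])       # optimal cost-to-go from each starting state
print(policy[0])  # optimal first-stage decision for each starting state
```

The point of the sketch is that each stage's optimal decision is computed using only the already-computed cost-to-go of the following stage, which is exactly the principle of optimality at work.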
