Finite-Action Approximation of Markov Decision Processes

Naci Saldi,Tamás Linder,Serdar Yüksel

doi:10.1007/978-3-319-79033-6_3

Abstract

In this chapter we study the finite-state approximation problem for computing near optimal policies for discrete-time MDPs with Borel state and action spaces, under discounted and average costs criteria. Even though existence and structural properties of optimal policies of MDPs have been studied extensively in the literature, computing such policies is generally a challenging problem for systems with uncountable state spaces. This situation also arises in the fully observed reduction of a partially observed Markov decision process even when the original system has finite state and action spaces. Here we show that one way to compute approximately optimal solutions for such MDPs is to construct a reduced model with a new transition probability and one-stage cost function by quantizing the state space, i.e., by discretizing it on a finite grid. It is reasonable to expect that when the one-stage cost function and the transition probability of the original model has certain continuity properties, the cost of the optimal policy for the approximating finite model converges to the optimal cost of the original model as the discretization becomes finer. Moreover, under additional continuity conditions on the transition probability and the one stage cost function we also obtain bounds on the accuracy of the approximation in terms of the number of points used to discretize the state space, thereby providing a tradeoff between the computation cost and the performance loss in the system. In particular, we study the following two problems.

Full Text