Abstract
This paper studies the asymptotic optimality of discrete-time Markov Decision Processes (MDPs for short) with general state and action spaces and with weak and strong interactions. Following an approach similar to that developed in [1], the idea is to reduce the dimension of the state space by considering an averaged model. The formulation is obtained by introducing a small parameter ε > 0 in the definition of the transition kernel, leading to a singularly perturbed Markov model with two time scales. It is first shown that the value function of the control problem for the perturbed system converges to the value function of a limit averaged control problem as ε goes to zero. It is then shown that a feedback control policy for the original control problem, constructed from an optimal feedback policy for the limit problem, is asymptotically optimal.
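A minimal sketch of the two-time-scale structure described above, with illustrative notation not taken from the paper (the kernels P, Q and the value functions V_ε, V̄ are assumptions for exposition only): the perturbed transition kernel is often written as a small perturbation of a fast-scale kernel, and the first convergence result then reads

```latex
% Illustrative notation, not the paper's own:
% P governs the fast dynamics, \varepsilon Q the slow perturbation.
P_{\varepsilon}(\mathrm{d}y \mid x, a)
  = P(\mathrm{d}y \mid x, a) + \varepsilon\, Q(\mathrm{d}y \mid x, a),
  \qquad \varepsilon > 0,
% and the first result of the abstract corresponds to
\lim_{\varepsilon \to 0} V_{\varepsilon}(x) = \bar{V}(x),
% where \bar{V} is the value function of the limit averaged control problem.
```

Here V_ε denotes the value function of the ε-perturbed problem and V̄ that of the averaged limit problem; the precise form of the kernel decomposition is a modeling assumption, sketched only to fix ideas.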