Abstract

This paper studies the asymptotic optimality of discrete-time Markov Decision Processes (MDPs for short) with general state and action spaces and with weak and strong interactions. Using an approach similar to the one developed in [1], the idea is to consider an MDP with general state and action spaces and to reduce the dimension of the state space by passing to an averaged model. This formulation is often described by introducing a small parameter ε > 0 in the definition of the transition kernel, leading to a singularly perturbed Markov model with two time scales. First, it is shown that the value function of the control problem for the perturbed system converges to the value function of a limit averaged control problem as ε goes to zero. It is then shown that a feedback control policy for the original control problem, defined from an optimal feedback policy for the limit problem, is asymptotically optimal.
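For concreteness, the two-time-scale structure is often written in the following standard form; this is a minimal sketch, and the kernels P and Q, as well as the symbols V^ε, V̄, J^ε, and π̄ below, are illustrative assumptions rather than the paper's own notation.

```latex
% Illustrative singularly perturbed transition kernel (assumed standard
% form, not this paper's exact notation): P is a stochastic kernel
% driving the fast dynamics, Q a signed kernel carrying the slow ones.
P^{\varepsilon}(dy \mid x, a) = P(dy \mid x, a) + \varepsilon\, Q(dy \mid x, a),
  \qquad \varepsilon > 0.

% Convergence of the perturbed value function to that of the averaged
% (limit) control problem:
\lim_{\varepsilon \to 0} V^{\varepsilon}(x) = \bar{V}(x).

% Asymptotic optimality of a policy \bar{\pi} built from an optimal
% feedback policy of the limit problem:
\lim_{\varepsilon \to 0} \bigl| J^{\varepsilon}(x, \bar{\pi}) - V^{\varepsilon}(x) \bigr| = 0.
```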
