Abstract
This paper studies the asymptotic optimality of discrete-time Markov Decision Processes (MDPs for short) with general state and action spaces and with weak and strong interactions. Following an approach similar to that developed in [1], the idea is to reduce the dimension of the state space by considering an averaged model. The formulation is obtained by introducing a small parameter ε > 0 in the definition of the transition kernel, leading to a singularly perturbed Markov model with two time scales. It is first shown that the value function of the control problem for the perturbed system converges to the value function of a limit averaged control problem as ε goes to zero. It is then shown that a feedback control policy for the original control problem, constructed from an optimal feedback policy for the limit problem, is asymptotically optimal.
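A minimal sketch of the two-time-scale structure described above, with illustrative notation not taken from the paper (the kernels P, Q and the value functions V_ε, V̄ are assumptions for exposition only): the perturbed transition kernel is often written as a small perturbation of a fast-scale kernel, and the first convergence result then reads

```latex
% Illustrative notation, not the paper's own:
% P governs the fast dynamics, \varepsilon Q the slow perturbation.
P_{\varepsilon}(\mathrm{d}y \mid x, a)
  = P(\mathrm{d}y \mid x, a) + \varepsilon\, Q(\mathrm{d}y \mid x, a),
  \qquad \varepsilon > 0,
% and the first result of the abstract corresponds to
\lim_{\varepsilon \to 0} V_{\varepsilon}(x) = \bar{V}(x),
% where \bar{V} is the value function of the limit averaged control problem.
```

Here V_ε denotes the value function of the ε-perturbed problem and V̄ that of the averaged limit problem; the precise form of the kernel decomposition is a modeling assumption, sketched only to fix ideas.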