Semi-infinite weighted Markov decision processes with perturbation

Mohammed Abbad, Khalid Rahhali

doi:10.1007/s001860400363

Abstract

In this paper, weighted-reward perturbed Markov decision processes with finite state and countable action spaces (semi-infinite WMDPs for short) are considered. The “weighted reward” refers to an appropriately normalized convex combination of the discounted and the long-run average reward criteria. This criterion allows the controller to trade off short-term rewards against long-run rewards. In every application where both the discounted and the long-run average criteria have been proposed in the past, there is clearly a rationale for considering the weighted criterion. Of course, as with all Markov decision models, the standard weighted criterion model assumes that all the transition probabilities are known precisely. Since this would not be the case in most applications, we consider the perturbed version of the weighted reward model. In the case of perturbations, we prove that for many models a nearly optimal strategy can be found in the class of relatively simple “ultimately deterministic” strategies. These are strategies that behave just like deterministic stationary strategies after a certain point in time.
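To make the weighted criterion concrete, the following is a minimal sketch, not the authors' construction. It assumes the normalization takes the form w = lam * (1 - beta) * v_beta + (1 - lam) * g, where v_beta is the discounted value and g the long-run average reward of a fixed deterministic stationary strategy; the abstract only says the combination is "appropriately normalized". The two-state chain, the rewards, and the names `discounted_value`, `average_reward`, `lam`, and `beta` are all hypothetical, chosen for illustration.

```python
def discounted_value(P, r, beta, tol=1e-12):
    """Discounted value of a fixed strategy: iterate v <- r + beta * P v."""
    n = len(r)
    v = [0.0] * n
    while True:
        new_v = [r[s] + beta * sum(P[s][t] * v[t] for t in range(n))
                 for s in range(n)]
        if max(abs(a - b) for a, b in zip(new_v, v)) < tol:
            return new_v
        v = new_v


def average_reward(P, r, horizon=10_000):
    """Approximate the long-run average reward by a finite-horizon mean."""
    n = len(r)
    total = [0.0] * n
    x = list(r)  # x holds P^t r, the expected reward at step t
    for _ in range(horizon):
        total = [total[s] + x[s] for s in range(n)]
        x = [sum(P[s][t] * x[t] for t in range(n)) for s in range(n)]
    return [tot / horizon for tot in total]


# Hypothetical chain induced by one deterministic stationary strategy.
P = [[0.5, 0.5],
     [0.5, 0.5]]
r = [1.0, 0.0]   # reward collected in each state
beta = 0.9       # discount factor (assumed)
lam = 0.5        # weight on the discounted criterion (assumed)

v = discounted_value(P, r, beta)
g = average_reward(P, r)

# Assumed normalization: (1 - beta) rescales v to the same units as g.
w = [lam * (1 - beta) * v[s] + (1 - lam) * g[s] for s in range(len(r))]
print(w)
```

Setting `lam` closer to 1 emphasizes short-term (discounted) rewards, while `lam` closer to 0 emphasizes the long-run average, which is the trade-off the abstract describes.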


Journal: Mathematical Methods of Operations Research
Publication Date: Oct 1, 2004
Citations: 6

Similar Papers

Semi-Infinite Weighted Markov Decision Processes
Mohammed Abbad ... Khalid Rahhali
Stochastic Models | VOL. 24
31 Oct 2008

Duality in Markov Decision Problems with Countable Action and State Spaces
John P Evans
Management Science | VOL. 15
01 Jul 1969

On an Approach to Evaluation of Health Care Programme by Markov Decision Model
Masayuki Horiguchi
01 Jan 2020

Illustrated review of convergence conditions of the value iteration algorithm and the rolling horizon procedure for average-cost MDPs
Eugenio Della Vecchia ... Alain Jean-Marie
Annals of Operations Research | VOL. 199
02 Feb 2012
