Uniformization for semi-Markov decision processes under stationary policies

Frederick J Beutler, Keith W Ross

doi:10.1017/s0021900200031375

Abstract

Uniformization permits the replacement of a semi-Markov decision process (SMDP) by a Markov chain exhibiting the same average rewards for simple (non-randomized) policies. It is shown that various anomalies may occur, especially for stationary (randomized) policies; uniformization introduces virtual jumps with concomitant action changes not present in the original process. Since these lead to discrepancies in the average rewards for stationary processes, uniformization can be accepted as valid only for simple policies. We generalize uniformization to yield consistent results for stationary policies also. These results are applied to constrained optimization of SMDP, in which stationary (randomized) policies appear naturally. The structure of optimal constrained SMDP policies can then be elucidated by studying the corresponding controlled Markov chains. Moreover, constrained SMDP optimal policy computations can be more easily implemented in discrete time, the generalized uniformization being employed to relate discrete- and continuous-time optimal constrained policies.
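As a rough illustration of the classical construction that the abstract starts from, the sketch below uniformizes a small continuous-time chain under one fixed simple policy and checks that the average reward per unit time is unchanged. It is not the generalized uniformization developed in the paper. The two-state example, the rate and reward values, the uniformization constant `big_lambda`, and the helper names `uniformize` and `stationary` are all assumptions made for illustration.

```python
def uniformize(rates, big_lambda):
    """Build P = I + Q / big_lambda from off-diagonal jump rates.

    rates[i][j] is the jump rate from state i to state j (i != j) under one
    fixed simple policy. big_lambda must be at least every total exit rate.
    """
    n = len(rates)
    P = [[0.0] * n for _ in range(n)]
    for i in range(n):
        exit_rate = sum(rates[i][j] for j in range(n) if j != i)
        if exit_rate > big_lambda:
            raise ValueError("big_lambda must dominate every exit rate")
        for j in range(n):
            if j != i:
                P[i][j] = rates[i][j] / big_lambda
        # The leftover probability is a virtual jump from i back to i.
        P[i][i] = 1.0 - exit_rate / big_lambda
    return P


def stationary(P, iters=10_000):
    """Approximate the stationary distribution of P by power iteration."""
    n = len(P)
    pi = [1.0 / n] * n
    for _ in range(iters):
        pi = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
    return pi


# Hypothetical two-state example: rate a from state 0 to 1, rate b from 1 to 0.
a, b = 2.0, 1.0
rates = [[0.0, a], [b, 0.0]]
reward_rate = [3.0, 0.0]   # reward earned per unit time in each state
big_lambda = 4.0           # assumed uniformization constant, >= max(a, b)

# Continuous time: a two-state chain has stationary law (b, a) / (a + b).
pi_cont = [b / (a + b), a / (a + b)]
avg_continuous = sum(p * r for p, r in zip(pi_cont, reward_rate))

# Discrete time: each step earns reward_rate / big_lambda, and steps occur
# at rate big_lambda, so the two scalings cancel per unit time.
P = uniformize(rates, big_lambda)
pi_disc = stationary(P)
avg_discrete = big_lambda * sum(
    p * (r / big_lambda) for p, r in zip(pi_disc, reward_rate)
)
```

The self-loop probability `P[i][i]` is the virtual jump mentioned in the abstract. Under a simple policy the same action is chosen again at a virtual jump, so nothing changes; under a randomized stationary policy the action could be redrawn there, which is the source of the discrepancies the abstract describes.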


Journal: Journal of Applied Probability
Publication Date: Sep 1, 1987
Citations: 25

Similar Papers

Uniformization for semi-Markov decision processes under stationary policies
Frederick J Beutler, Keith W Ross
Journal of Applied Probability | VOL. 24
01 Sep 1987

Semi-Markov and Jump Markov Controlled Models: Average Cost Criterion
M Yu Kitayev
Theory of Probability & Its Applications | VOL. 30
01 Jun 1986

Discrete-time equivalence for constrained semi-Markov decision processes
Frederick Beutler ... Keith Ross
01 Dec 1985

First passage models for denumerable semi-Markov decision processes with nonnegative discounted costs
Yong-Hui Huang ... Guo Xian-Ping
Acta Mathematicae Applicatae Sinica, English Series | VOL. 27
12 Mar 2011
