Average and Blackwell optimal policies in denumerable Markov decision chains

Arie Hordijk

doi:10.1109/cdc.1986.267436

Abstract

This talk is based on the paper by Dekker and Hordijk [1986]. In that paper we consider a discrete-time Markov decision chain with a denumerable state space and compact action sets, and we assume that for all states the rewards and transition probabilities depend continuously on the actions. The first objective is to develop an analysis of average optimality without assuming a special Markov chain structure. In doing so, we present a set of conditions guaranteeing average optimality, which are automatically fulfilled in the finite state and action model. The second objective is to study average and discount optimality simultaneously, as Veinott [1969] did for the finite state and action model. We investigate the concepts of n-discount and Blackwell optimality for a denumerable state space, using a Laurent series expansion for the discounted rewards. Under the same conditions as for average optimality, we establish solutions to the n-discount optimality equations for every n.
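
To make the Laurent series expansion mentioned above concrete, the following is a minimal sketch for the finite model only, not the denumerable setting of the paper. It assumes a single fixed policy on a hypothetical 2-state chain. The transition matrix `P`, reward vector `r`, discount factor `beta`, and all function names are illustrative assumptions and do not come from the paper. With rho = (1 - beta) / beta, the standard finite-state expansion is v_beta = (1 + rho) * [g / rho + sum over n >= 0 of (-rho)^n H^(n+1) r], where g = P* r is the average reward, P* is the stationary matrix, and H = (I - P + P*)^(-1) - P* is the deviation matrix. The sketch compares a truncated series with the directly computed discounted reward.

```python
# Sketch under stated assumptions: 2-state chain, one fixed policy,
# illustrative numbers only (nothing here is taken from the paper).

def mat_vec(A, v):
    """Multiply matrix A (list of rows) by vector v."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def inv2(A):
    """Inverse of a 2x2 matrix."""
    (a, b), (c, d) = A
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def discounted_value(P, r, beta):
    """Direct computation: v_beta = (I - beta * P)^{-1} r."""
    M = [[(1.0 if i == j else 0.0) - beta * P[i][j] for j in range(2)]
         for i in range(2)]
    return mat_vec(inv2(M), r)

def laurent_value(P, r, beta, terms):
    """Truncated Laurent expansion of v_beta in rho = (1 - beta) / beta."""
    rho = (1.0 - beta) / beta
    # Stationary distribution of an irreducible 2-state chain.
    a, b = P[0][1], P[1][0]
    pi = [b / (a + b), a / (a + b)]
    P_star = [pi[:], pi[:]]
    # Deviation matrix H = (I - P + P*)^{-1} - P*.
    Z = inv2([[(1.0 if i == j else 0.0) - P[i][j] + P_star[i][j]
               for j in range(2)] for i in range(2)])
    H = [[Z[i][j] - P_star[i][j] for j in range(2)] for i in range(2)]
    gain = mat_vec(P_star, r)          # average reward g = P* r
    total = [g / rho for g in gain]    # singular term g / rho
    y = r[:]
    for n in range(terms):
        y = mat_vec(H, y)              # y now equals H^{n+1} r
        total = [t + (-rho) ** n * yi for t, yi in zip(total, y)]
    return [(1.0 + rho) * t for t in total]

P = [[0.9, 0.1], [0.2, 0.8]]   # hypothetical transition matrix
r = [1.0, 0.0]                 # hypothetical one-step rewards
beta = 0.95                    # hypothetical discount factor

exact = discounted_value(P, r, beta)
series = laurent_value(P, r, beta, terms=30)
max_gap = max(abs(x - y) for x, y in zip(exact, series))
print(exact, series, max_gap)
```

The series converges here because rho is small relative to the spectral gap of the chain (rho is about 0.053 and the gap is 0.3). The leading coefficient g is what average optimality compares, and the higher-order coefficients are what the n-discount criteria refine. Whether and how such an expansion holds for a denumerable state space depends on the conditions developed in the paper.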

Similar Papers

Average, Sensitive and Blackwell Optimal Policies in Denumerable Markov Decision Chains with Unbounded Rewards
Rommert Dekker ... Arie Hordijk
Mathematics of Operations Research | VOL. 13
01 Aug 1988

Recurrence Conditions for Average and Blackwell Optimality in Denumerable State Markov Decision Chains
Rommert Dekker ... Arie Hordijk
Mathematics of Operations Research | VOL. 17
01 Jan 1992

Blackwell Optimality in Borelian Continuous-in-Action Markov Decision Processes
Alexander A Yushkevich
SIAM Journal on Control and Optimization | VOL. 35
01 Nov 1997

Approximation of two-person zero-sum continuous-time Markov games with average payoff criterion
José María Lorenzo ... Tomás Prieto-Rumeau
Operations Research Letters | VOL. 43
12 Dec 2014
