Abstract

We integrate two numerical procedures for solving the average reward Markov decision process (MDP): standard successive approximations and modified policy iteration with reward revision. Reward revision is the process of revising the reward structure of a second, more computationally desirable MDP so as to produce, in the limit, an optimality equation whose fixed point is identical to that associated with the original MDP. A numerical study indicates that for MDPs having a non-sparse transition structure with a small number of relatively large entries per row, the addition of reward revision can yield significant computational benefits.
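
The abstract does not reproduce the algorithms themselves. For orientation only, the sketch below shows standard successive approximations for the average-reward case (relative value iteration), assuming a finite MDP given by a transition array `P` and a reward array `r` (both names hypothetical); the reward-revision modification studied in the paper is not shown.

```python
import numpy as np

def relative_value_iteration(P, r, ref=0, tol=1e-8, max_iter=100_000):
    """Illustrative relative value iteration for an average-reward MDP.

    P: (A, S, S) array, P[a, s, s'] = transition probability.
    r: (S, A) array of one-step rewards.
    Returns (gain, policy, v): estimated optimal average reward,
    a greedy deterministic policy, and the relative value function.
    """
    A, S, _ = P.shape
    v = np.zeros(S)
    for _ in range(max_iter):
        # One-step lookahead: Q[s, a] = r[s, a] + sum_{s'} P[a, s, s'] v[s']
        Q = r + np.einsum("asp,p->sa", P, v)
        w = Q.max(axis=1)
        delta = w - v
        # Span-seminorm stopping rule used for average-reward problems.
        if delta.max() - delta.min() < tol:
            gain = 0.5 * (delta.max() + delta.min())
            return gain, Q.argmax(axis=1), v
        v = w - w[ref]  # subtract a reference state's value to keep iterates bounded
    raise RuntimeError("did not converge within max_iter iterations")

# Example with purely illustrative numbers: a 2-state, 2-action MDP.
P = np.array([[[0.9, 0.1], [0.2, 0.8]],   # action 0
              [[0.5, 0.5], [0.6, 0.4]]])  # action 1
r = np.array([[5.0, 10.0],
              [-1.0, 2.0]])
gain, policy, v = relative_value_iteration(P, r)
```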
