Abstract

We study the policy iteration algorithm (PIA) for continuous-time jump Markov decision processes in general state and action spaces. The corresponding transition rates are allowed to be unbounded, and the reward rates may have neither upper nor lower bounds. The criterion that we are concerned with is expected average reward. We propose a set of conditions under which we first establish the average reward optimality equation and present the PIA. Then, under two slightly different sets of conditions, we show that the PIA yields the optimal (maximum) reward, an average optimal stationary policy, and a solution to the average reward optimality equation.

Highlights

  • In this paper we study the average reward optimality problem for continuous-time jump Markov decision processes (MDPs) in general state and action spaces.

  • In the previous sections we have studied the policy iteration algorithm (PIA) for average reward continuous-time jump MDPs in Polish spaces.

  • Under two slightly different sets of conditions we have shown that the PIA yields the optimal (maximum) reward, an average optimal stationary policy, and a solution to the average reward optimality equation.


Summary

Introduction

In this paper we study the average reward optimality problem for continuous-time jump Markov decision processes (MDPs) in general state and action spaces. We approach this problem by means of the well-known policy iteration algorithm (PIA), also known as Howard's policy improvement algorithm. The PIA was originally introduced by Howard (1960) in [1] for finite MDPs (i.e., the state and action spaces are both finite). By using the monotonicity of the sequence of iterated average rewards, he showed that the PIA converges in a finite number of steps. When the state space is not finite, there are well-known counterexamples showing that the PIA need not converge even when the action space is compact (see, e.g., [2–4]).
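To make the finite case concrete, the following is a minimal sketch of Howard-style policy iteration for a finite continuous-time MDP under the average-reward criterion. It is an illustration only, not the algorithm as stated in the paper for general state spaces. The names `solve`, `evaluate`, `policy_iteration`, `q`, and `r` are hypothetical, the toy data are invented, and the sketch assumes every stationary policy induces a chain with a single recurrent class, so that the evaluation equations with the normalization h(0) = 0 have a unique solution.

```python
# Hypothetical illustration: policy iteration for a FINITE continuous-time MDP.
# q[x][a][y]: transition rate from state x to y under action a (rows sum to 0).
# r[x][a]:    reward rate in state x under action a.
# Assumption: each stationary policy has a single recurrent class.

def solve(A, b):
    """Solve A z = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda i: abs(M[i][c]))
        M[c], M[p] = M[p], M[c]
        for i in range(n):
            if i != c:
                m = M[i][c] / M[c][c]
                M[i] = [u - m * v for u, v in zip(M[i], M[c])]
    return [M[i][n] / M[i][i] for i in range(n)]

def evaluate(q, r, f):
    """Policy evaluation: g = r(x, f(x)) + sum_y q(y | x, f(x)) h(y), h(0) = 0."""
    n = len(f)
    # Unknowns are (g, h(1), ..., h(n-1)); h(0) is fixed to 0.
    A = [[1.0] + [-q[x][f[x]][y] for y in range(1, n)] for x in range(n)]
    b = [r[x][f[x]] for x in range(n)]
    z = solve(A, b)
    return z[0], [0.0] + z[1:]

def policy_iteration(q, r, f, tol=1e-9, max_iter=100):
    """Alternate evaluation and improvement until the policy stops changing."""
    n = len(f)
    f = list(f)
    for _ in range(max_iter):
        g, h = evaluate(q, r, f)
        new_f = list(f)
        for x in range(n):
            def value(a):
                return r[x][a] + sum(q[x][a][y] * h[y] for y in range(n))
            best = max(range(len(r[x])), key=value)
            # Switch only on strict improvement, so ties keep the current action.
            if value(best) > value(f[x]) + tol:
                new_f[x] = best
        if new_f == f:
            return g, h, f
        f = new_f
    raise RuntimeError("no convergence within max_iter")

# Invented toy model: two states, two actions per state.
q = [[[-1.0, 1.0], [-3.0, 3.0]],
     [[1.0, -1.0], [2.0, -2.0]]]
r = [[0.0, -1.0],
     [4.0, 5.0]]
g, h, f = policy_iteration(q, r, [0, 0])
print(f, round(g, 6))
```

Starting from the policy that uses action 0 in both states, the sketch improves once (state 0 switches to action 1) and then stops, because no action gives a strict improvement. Keeping the current action on ties is what prevents cycling between equally good policies in the finite case.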
