Abstract

This paper investigates nonstationary continuous-time Markov decision processes under the expected total reward criterion. The state space S and the action sets A(i) are countable, and the transition rates q_ij(t,a) and the reward rate functions r_i(t,a) are nonhomogeneous and unbounded. For this model, the optimality equation is established and the existence of ε-optimal policies is proved. Finally, the periodic case for the discounted criterion is discussed as a special case of the nonstationary one.

