Abstract
This paper investigates nonstationary continuous-time Markov decision processes under the expected total reward criterion. Both the state space S and the action sets A(i) are countable, and both the transition rates q_ij(t,a) and the reward rate functions r_i(t,a) are nonhomogeneous and unbounded. For this model, the optimality equation is established and the existence of ε-optimal policies is proved. Finally, the periodic case under the discounted criterion is discussed as a special case of the nonstationary one.