Abstract

This paper studies the first passage optimality criterion for continuous-time Markov decision processes with state-dependent discount factors and history-dependent policies. The state space is denumerable, the action space is a Borel space, and the transition and reward rates are unbounded. Under suitable conditions, we show the existence of a deterministic stationary optimal policy, establish the Bellman (optimality) equation, of which the value function is the unique solution, and give value and policy iteration algorithms for computing (or at least approximating) the value function and an optimal policy. Furthermore, we give examples on reliability and on controlled birth processes with killing to illustrate potential applications of the results obtained here, and to show how the main results of this paper differ from those in the previous literature.
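To give a concrete feel for the value iteration mentioned above, the following is a minimal sketch under strong simplifying assumptions that are not taken from the paper: the state and action sets are finite, all rates are bounded, the discount factors are strictly positive, and the optimality equation is assumed to take the standard form alpha(i) V(i) = max_a [ r(i,a) + sum_j q(j|i,a) V(j) ] outside the target set, with V = 0 on the target set. The function name `value_iteration` and the arguments `q`, `r`, `alpha`, `target` are hypothetical names chosen for illustration.

```python
def value_iteration(q, r, alpha, target, tol=1e-10, max_iter=10_000):
    """Illustrative first passage value iteration (finite model, assumed form).

    q[i][a][j] : transition rate from state i to j under action a
                 (q[i][a][i] is minus the total exit rate)
    r[i][a]    : reward rate in state i under action a
    alpha[i]   : state-dependent discount factor, assumed > 0
    target     : set of target states, where the value is fixed at 0
    """
    n = len(q)
    v = [0.0] * n
    policy = [None] * n
    for _ in range(max_iter):
        new_v = [0.0] * n
        for i in range(n):
            if i in target:
                continue  # rewards stop once the target set is reached
            best, best_a = float("-inf"), None
            for a in range(len(q[i])):
                exit_rate = -q[i][a][i]
                inflow = sum(q[i][a][j] * v[j] for j in range(n) if j != i)
                # Rearranged optimality equation: move the q(i|i,a) V(i) term
                # to the left-hand side and divide by alpha(i) + exit rate.
                value = (r[i][a] + inflow) / (alpha[i] + exit_rate)
                if value > best:
                    best, best_a = value, a
            new_v[i], policy[i] = best, best_a
        if max(abs(x - y) for x, y in zip(new_v, v)) < tol:
            return new_v, policy
        v = new_v
    return v, policy


# Toy model (made up for illustration): state 0 is non-target, state 1 is the target.
# Action 0: exit rate 1, reward rate 2. Action 1: exit rate 3, reward rate 5.
q = [[[-1.0, 1.0], [-3.0, 3.0]], [[0.0, 0.0], [0.0, 0.0]]]
r = [[2.0, 5.0], [0.0, 0.0]]
alpha = [1.0, 1.0]
values, policy = value_iteration(q, r, alpha, target={1})
```

In this toy model the two candidate values for state 0 are 2 / (1 + 1) = 1 and 5 / (1 + 3) = 1.25, so the sketch selects action 1. The paper's setting (denumerable states, Borel actions, unbounded rates) is far more general and would need the conditions stated there.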
