Abstract

This paper investigates piecewise deterministic Markov decision processes (PDMDPs) under the risk probability criterion. The optimality problem is to minimize the probability that the finite-horizon total cost is no more than the cost goal. Under suitable conditions, the value iteration algorithm is used to establish the existence of a solution to the optimality problem. Several new facts are then established to prove that the value function is the unique solution to the optimality problem and that an optimal policy exists. Finally, two examples illustrate the application of risk probability PDMDPs: the first verifies the main conditions, and the second computes the value function and an optimal risk probability policy.
