Abstract

This paper studies a multiconstrained problem for piecewise deterministic Markov decision processes (PDMDPs) with unbounded cost and transition rates. The goal is to minimize one type of expected finite-horizon cost over history-dependent policies while keeping some other types of expected finite-horizon costs lower than some tolerable bounds. Using the Dynkin formula for the PDMDPs, we obtain an equivalent characterization of occupancy measures and express the expected finite-horizon costs in terms of occupancy measures. Under suitable assumptions, the existence of constrained-optimal policies is shown, the linear programming formulation and its dual program for the constrained problem are derived, and the strong duality between the two programs is established. An example is provided to demonstrate our results.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call