Abstract

Stochastic Shortest Path MDPs (SSP-MDPs) model probabilistic sequential decision problems in which the objective is to minimize the expected accumulated cost to reach a goal. In the presence of dead-ends, however, this conventional criterion can become ill-defined. Lexicographic criteria address this by preferring policies that reach the goal with the highest possible probability. Other criteria instead trade off some cost measure against probability-to-goal. However, both approaches can lead to policies that might not represent the choice of a real decision-maker. In this work, we propose the GUBS criterion to address these problems. GUBS combines goal prioritization over histories with Expected Utility Theory. Among all the criteria analyzed, it is the only one that not only allows a trade-off between a large accumulated cost and a small loss in probability-to-goal, but also guarantees arbitrary trade-offs that can be tuned through its parameters without prior knowledge of the problem being solved. We also propose eGUBS, the particular case of GUBS in which the exponential utility function is used, and two algorithms for solving these problems optimally: eGUBS-VI, a value-iteration-based algorithm, and eGUBS-AO*, a heuristic search algorithm. Results indicate that, when a good heuristic function is available or when the state space is too large, eGUBS-AO* can outperform eGUBS-VI by searching efficiently. In other cases, eGUBS-VI's simpler approach might yield better results.
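As a rough illustration of the trade-off described above, the following minimal sketch compares two hypothetical policies on a toy problem. It assumes, without confirmation from this abstract, that a history's utility has the form u(cost) + K_g when the goal is reached and u(cost) otherwise, with the exponential utility u(c) = exp(-λc). The names `expected_utility`, `lam`, and `k_goal`, as well as all numbers, are illustrative assumptions and not the authors' implementation.

```python
import math

# Each policy is summarized by its possible histories:
# (probability, accumulated cost, whether the goal is reached).
# A dead-end history accumulates cost forever, modeled here as math.inf.
SAFE = [(1.0, 1000.0, True)]            # always reaches the goal, but expensively
RISKY = [(0.99, 10.0, True),            # cheap, with a small chance of a dead-end
         (0.01, math.inf, False)]


def expected_cost(histories):
    """Conventional SSP objective: infinite as soon as a dead-end has positive probability."""
    return sum(prob * cost for prob, cost, _ in histories)


def expected_utility(histories, lam, k_goal):
    """Assumed GUBS-like objective: exponential cost utility plus a goal bonus.

    lam (> 0) is a hypothetical risk parameter; k_goal is a hypothetical
    constant rewarding histories that reach the goal.
    """
    total = 0.0
    for prob, cost, reaches_goal in histories:
        # exp(-lam * inf) evaluates to 0.0, so dead-end histories stay finite.
        cost_utility = math.exp(-lam * cost)
        goal_bonus = k_goal if reaches_goal else 0.0
        total += prob * (cost_utility + goal_bonus)
    return total


def preferred(lam, k_goal):
    """Return the name of the policy with the higher assumed expected utility."""
    safe_value = expected_utility(SAFE, lam, k_goal)
    risky_value = expected_utility(RISKY, lam, k_goal)
    return "safe" if safe_value > risky_value else "risky"
```

Under these assumptions, the expected cost of the risky policy is infinite, so the conventional criterion cannot rank it meaningfully, whereas the utility-based value stays finite. With `lam = 0.1`, a small goal bonus such as `k_goal = 1.0` favors the cheap policy despite its 1% loss in probability-to-goal, and a large bonus such as `k_goal = 100.0` favors the policy that always reaches the goal.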
