Abstract

Partially Observable Markov Decision Processes (POMDPs) provide a natural and principled framework for sequential decision-making under uncertainty. However, solving large-scale POMDPs suffers from the exponential growth of the belief space and of the space of policy trees. We present a new point-based incremental pruning algorithm that exploits the piecewise linearity and convexity of the value function. Instead of reasoning about the whole belief space when pruning the cross-sums during POMDP policy construction, our algorithm uses belief points to perform approximate pruning while generating policy trees, and obtains the optimal policy for the belief states encountered in real time. The empirical results indicate that point-based incremental pruning for heuristic search methods can handle large POMDP domains efficiently.
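The core operation described above, pruning a set of alpha-vectors against sampled belief points instead of over the entire belief simplex, can be illustrated with a short sketch. This is an assumption-based illustration of generic point-based pruning, not the authors' exact algorithm; the function name point_based_prune and the toy problem sizes are hypothetical.

import numpy as np

def point_based_prune(vectors, beliefs):
    """Keep only the alpha-vectors that are maximal at some belief point.

    vectors: (n_vectors, n_states) array of alpha-vectors (e.g. a cross-sum set)
    beliefs: (n_beliefs, n_states) array of sampled belief points
    Returns the pruned (n_kept, n_states) array.
    """
    # Value of every alpha-vector at every belief point: shape (n_beliefs, n_vectors)
    values = beliefs @ vectors.T
    # For each belief point, the index of the dominating alpha-vector
    winners = np.unique(np.argmax(values, axis=1))
    return vectors[winners]

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Toy 3-state problem: 20 candidate alpha-vectors, 10 sampled belief points
    alpha = rng.normal(size=(20, 3))
    B = rng.dirichlet(np.ones(3), size=10)  # belief points lie on the probability simplex
    pruned = point_based_prune(alpha, B)
    print(f"kept {len(pruned)} of {len(alpha)} alpha-vectors")

Because only vectors that win at some sampled belief are retained, the pruned set grows with the number of belief points rather than with the full cross-sum, which is what makes the point-based approach tractable on larger domains.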
