Focused Topological Value Iteration

Peng Dai, Mausam, Daniel Weld

doi:10.1609/icaps.v19i1.18138

Abstract

Topological value iteration (TVI) is an effective algorithm for solving Markov decision processes (MDPs) optimally: it 1) divides an MDP into strongly connected components, and 2) solves these components sequentially. Yet TVI's usefulness tends to degrade if an MDP has large components, because the cost of the division process isn't offset by gains during solution. This paper presents a new algorithm to solve MDPs optimally, focused topological value iteration (FTVI). FTVI addresses TVI's limitations by restricting its attention to connected components that are relevant for solving the MDP. Specifically, FTVI uses a small amount of heuristic search to eliminate provably sub-optimal actions; this pruning allows FTVI to find smaller connected components and thus run faster. We demonstrate that our new algorithm outperforms TVI by an order of magnitude, averaged across several domains. Surprisingly, FTVI also significantly outperforms popular 'heuristically-informed' MDP algorithms such as LAO*, LRTDP, and BRTDP in many domains, sometimes by as much as two orders of magnitude. Finally, we characterize the type of domains where FTVI excels, which suggests a way to make an informed choice of solver.
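
To make the two TVI steps named in the abstract concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the authors' implementation: the MDP encoding (`mdp[state][action] = (cost, [(prob, next_state), ...])`), the cost-minimizing goal-directed setting, the function names, and the toy example are all hypothetical. FTVI's heuristic-search pruning step is not shown, since the abstract does not give its details.

```python
def strongly_connected_components(states, successors):
    """Tarjan's algorithm. Components are emitted in reverse topological
    order: a component appears only after every component it can reach."""
    index, low, stack, on_stack, components = {}, {}, [], set(), []

    def visit(v):
        index[v] = low[v] = len(index)
        stack.append(v)
        on_stack.add(v)
        for w in successors(v):
            if w not in index:
                visit(w)
                low[v] = min(low[v], low[w])
            elif w in on_stack:
                low[v] = min(low[v], index[w])
        if low[v] == index[v]:  # v is the root of a component
            component = []
            while True:
                w = stack.pop()
                on_stack.discard(w)
                component.append(w)
                if w == v:
                    break
            components.append(component)

    for s in states:
        if s not in index:
            visit(s)
    return components


def topological_value_iteration(mdp, goals, epsilon=1e-6):
    """Assumed encoding: every state is a key of `mdp`; goal states map to {}.
    States without actions (goals here) keep value 0."""
    def successors(s):
        return {t for _cost, outcomes in mdp[s].values() for _p, t in outcomes}

    values = {s: 0.0 for s in mdp}
    # Step 1: divide the state graph into strongly connected components.
    # Step 2: solve them one at a time, successor components first.
    for component in strongly_connected_components(list(mdp), successors):
        while True:
            residual = 0.0
            for s in component:
                if s in goals or not mdp[s]:
                    continue
                best = min(
                    cost + sum(p * values[t] for p, t in outcomes)
                    for cost, outcomes in mdp[s].values()
                )
                residual = max(residual, abs(best - values[s]))
                values[s] = best
            if residual < epsilon:
                break
    return values


# Hypothetical toy MDP: s1 and s2 form a cycle, so they share one component.
mdp = {
    "s0": {"a": (1.0, [(1.0, "s1")])},
    "s1": {"go": (1.0, [(0.5, "goal"), (0.5, "s2")])},
    "s2": {"back": (1.0, [(1.0, "s1")])},
    "goal": {},
}
values = topological_value_iteration(mdp, goals={"goal"})
```

Because each component is solved only after the components it leads to, the backups inside a component never read values that will change later. The recursive `visit` keeps the sketch short; a large MDP would need an iterative version to avoid Python's recursion limit.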

Journal: Proceedings of the International Conference on Automated Planning and Scheduling
Publication Date: May 24, 2021
Citations: 9
License type: cc-by
