Abstract

Since the emergence of Artificial Intelligence as a field, planning under uncertainty has been viewed as one of its crucial subareas. Accordingly, the current lack of scalability of probabilistic planning techniques is a major reason why the grand vision of AI has not been fulfilled yet. Besides being an obstacle to advances in AI, scalability issues also hamper the applicability of probabilistic planning to real-world problems. A powerful framework for describing probabilistic planning problems is Markov Decision Processes (MDPs). Informally, an MDP specifies the objectives the agent is trying to achieve, the actions the agent can perform, and the states in which it can end up while working towards the objective. Solving an MDP means finding a policy, i.e., an assignment of actions to states, that allows the agent to achieve the objective. Optimal solution methods, those that look for the “best” policy according to some criterion, typically analyze all possible states or a large fraction of them. Since the state spaces of realistic scenarios can be astronomically large, these algorithms quickly run out of memory. Fortunately, the mathematical structure of some classes of MDPs has enabled the design of more efficient algorithms. This is the case for Stochastic Shortest Path (SSP) problems, whose mathematical properties gave rise to a family of algorithms called Find-and-Revise. When used in combination with a heuristic, the members of this family can find a near-optimal, or even optimal, policy while avoiding the analysis of many states. Nonetheless, the sheer number of states in real-world problems forces algorithms based on state-level analysis to exhaust their capabilities much too early, calling for a fundamentally different approach. Moreover, several expressive and potentially useful MDP classes are currently not known to possess mathematical structure that would support efficient approximation techniques.
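As a purely illustrative sketch of the concepts above, the following toy SSP MDP (invented here, not taken from the dissertation) encodes states, actions with probabilistic outcomes, action costs, and an absorbing goal, and solves it with value iteration, a simple optimal method that, as noted, must touch every state:

```python
# Toy SSP MDP, invented for illustration only.
# transitions[state][action] -> list of (probability, next_state, cost)
transitions = {
    "s0": {
        "safe":  [(1.0, "s1", 2.0)],                   # surely reach s1, cost 2
        "risky": [(0.5, "g", 1.0), (0.5, "s0", 1.0)],  # 50% chance to reach the goal
    },
    "s1": {
        "go": [(1.0, "g", 1.0)],
    },
}

def value_iteration(transitions, goal="g", eps=1e-6):
    """Compute the optimal expected cost-to-goal V and its greedy policy."""
    V = {s: 0.0 for s in transitions}
    V[goal] = 0.0  # the goal is absorbing and incurs no further cost
    while True:
        delta = 0.0
        for s, actions in transitions.items():
            # Bellman backup: best expected one-step cost plus cost-to-go
            best = min(
                sum(p * (c + V[s2]) for p, s2, c in outcomes)
                for outcomes in actions.values()
            )
            delta = max(delta, abs(best - V[s]))
            V[s] = best
        if delta < eps:
            policy = {
                s: min(actions, key=lambda a: sum(p * (c + V[s2])
                                                  for p, s2, c in actions[a]))
                for s, actions in transitions.items()
            }
            return V, policy
```

On this example the computed policy picks "risky" in s0, whose expected cost-to-goal is 2, over "safe", whose cost is 3. The point of the sketch is the tabular value function V with one entry per state: exactly the memory footprint that becomes prohibitive when state spaces are astronomical.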
This dissertation advances the state of the art in probabilistic planning in three complementary ways. For SSP MDPs, it derives a novel class of approximate algorithms based on generalizing state analysis information across many states. This information is stored in the form of automatically generated basis functions, whose number is much smaller than the number of states. As a result, the proposed algorithms have a very compact memory footprint and arrive at high-quality solutions faster than their state-based counterparts. In a parallel effort, the dissertation also builds up the mathematical apparatus for classes of MDPs that previously had no applicable approximation algorithms. The developed theory enables the extension of the powerful Find-and-Revise paradigm to these MDP classes as well, providing the first memory-efficient algorithms for solving them. Last but not least, the dissertation will apply the proposed theoretical techniques to a large-scale real-world problem, urban traffic routing being one of the candidates.
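The basis-function idea can be sketched in a few lines. The features and update rule below are hypothetical stand-ins, not the dissertation's actual construction; they only show how a value function represented by weights over basis functions uses memory proportional to the number of features rather than the number of states:

```python
# Hypothetical illustration of value-function approximation with basis
# functions; the dissertation generates its basis functions automatically,
# whereas these are hand-picked for the sketch.

def features(s):
    # Three hypothetical basis functions over a numeric state.
    return [1.0, s, s * s]

def approx_value(w, s):
    """V(s) is approximated as the weighted sum of basis functions."""
    return sum(wi * fi for wi, fi in zip(w, features(s)))

def update(w, s, target, lr=0.1):
    """Nudge the weights so the approximation at s moves toward a target
    cost-to-goal, e.g. one obtained from a Bellman backup."""
    err = target - approx_value(w, s)
    return [wi + lr * err * fi for wi, fi in zip(w, features(s))]
```

However large the state space, only the weight vector (here, three numbers) is stored, which is the source of the compact memory footprint described above.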
