DESPOT: Online POMDP Planning with Regularization

Nan Ye, David Hsu, Wee Sun Lee, Adhiraj Somani

doi:10.1613/jair.5328

Abstract

The partially observable Markov decision process (POMDP) provides a principled general framework for planning under uncertainty, but solving POMDPs optimally is computationally intractable, due to the "curse of dimensionality" and the "curse of history". To overcome these challenges, we introduce the Determinized Sparse Partially Observable Tree (DESPOT), a sparse approximation of the standard belief tree, for online planning under uncertainty. A DESPOT focuses online planning on a set of randomly sampled scenarios and compactly captures the "execution" of all policies under these scenarios. We show that the best policy obtained from a DESPOT is near-optimal, with a regret bound that depends on the representation size of the optimal policy. Leveraging this result, we give an anytime online planning algorithm, which searches a DESPOT for a policy that optimizes a regularized objective function. Regularization balances the estimated value of a policy under the sampled scenarios and the policy size, thus avoiding overfitting. The algorithm demonstrates strong experimental results, compared with some of the best online POMDP algorithms available. It has also been incorporated into an autonomous driving system for real-time vehicle control. The source code for the algorithm is available online.
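The regularized objective described in the abstract can be illustrated with a minimal sketch. It assumes the objective takes the form "average return over the sampled scenarios minus a penalty proportional to policy size"; the names `regularized_value`, `select_policy`, and `lam`, and the candidate data, are hypothetical and chosen only for illustration. This is not the authors' implementation and it does not build or search a DESPOT.

```python
from statistics import mean


def regularized_value(returns, size, lam):
    """Empirical value over the sampled scenarios minus a size penalty.

    returns: one simulated return per sampled scenario (illustrative data)
    size:    a measure of policy size, e.g. number of policy-tree nodes
    lam:     regularization weight (hypothetical name), lam >= 0
    """
    return mean(returns) - lam * size


def select_policy(candidates, lam):
    """Return the name of the candidate with the best regularized value."""
    return max(
        candidates,
        key=lambda name: regularized_value(
            candidates[name]["returns"], candidates[name]["size"], lam
        ),
    )


# Two made-up candidates: the large policy scores slightly better on the
# sampled scenarios but is far bigger, so it is more likely to be overfitting.
candidates = {
    "small": {"returns": [8.0, 9.0, 10.0, 9.0], "size": 3},
    "large": {"returns": [10.0, 11.0, 10.0, 11.0], "size": 40},
}

print(select_policy(candidates, lam=0.0))  # no penalty: picks "large"
print(select_policy(candidates, lam=0.1))  # with penalty: picks "small"
```

With `lam = 0` the selection depends only on the estimated value under the sampled scenarios. Increasing `lam` shifts the choice toward smaller policies, which is the trade-off the abstract credits with avoiding overfitting.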

Journal: Journal of Artificial Intelligence Research
Publication Date: Jan 26, 2017
Citations: 125
License type: CC BY
