Covering Number as a Complexity Measure for POMDP Planning and Learning

Zongzhang Zhang, Michael Littman, Xiaoping Chen

doi:10.1609/aaai.v26i1.8360

Abstract

Finding a meaningful way of characterizing the difficulty of partially observable Markov decision processes (POMDPs) is a core theoretical problem in POMDP research. State-space size is often used as a proxy for POMDP difficulty, but it is a weak metric at best. Existing work has established theoretical links between the complexity of POMDP planning and the covering number for the reachable belief space, that is, the set of belief points reachable from the initial belief point. In this paper, we present empirical evidence that the covering number for the reachable belief space (or just "covering number", for brevity) is a far better complexity measure than the state-space size for both planning and learning POMDPs on several small-scale benchmark problems. We connect the covering number to the complexity of learning POMDPs by proposing a provably convergent learning algorithm for POMDPs without reset, given knowledge of the covering number.
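To make the notion of a covering number for the reachable belief space concrete, the following is a minimal sketch, not the authors' procedure. It enumerates beliefs reachable from an initial belief by Bayesian updates and counts the centers chosen by a greedy cover, which gives an upper bound on the covering number of the sampled set at radius `delta`. The L1 metric, the array layouts `T[a, s, s']` and `O[a, s', o]`, the function names, and the two-state toy model are all assumptions made for illustration.

```python
import numpy as np

def belief_update(b, T_a, O_a_o):
    """Bayes filter: b'(s') is proportional to O(o | s', a) * sum_s T(s' | s, a) b(s)."""
    unnorm = O_a_o * (b @ T_a)
    total = unnorm.sum()
    # An observation with (near) zero probability yields no successor belief.
    return None if total <= 1e-12 else unnorm / total

def reachable_beliefs(b0, T, O, depth):
    """Enumerate beliefs reachable from b0 within `depth` steps, breadth-first."""
    frontier, seen = [b0], [b0]
    for _ in range(depth):
        nxt = []
        for b in frontier:
            for a in range(T.shape[0]):
                for o in range(O.shape[2]):
                    b2 = belief_update(b, T[a], O[a, :, o])
                    if b2 is not None:
                        nxt.append(b2)
        seen.extend(nxt)
        frontier = nxt
    return np.array(seen)

def greedy_cover_size(points, delta):
    """Size of a greedy delta-cover under L1 distance (assumed metric).

    Every point ends up within delta of some chosen center, so the result
    is an upper bound on the delta-covering number of `points`.
    """
    centers = []
    for p in points:
        if not any(np.abs(p - c).sum() <= delta for c in centers):
            centers.append(p)
    return len(centers)

# Hypothetical toy POMDP: 2 states, 1 "listen" action, 2 observations.
T = np.array([[[1.0, 0.0],
               [0.0, 1.0]]])            # T[a, s, s']: the state never changes
O = np.array([[[0.85, 0.15],
               [0.15, 0.85]]])          # O[a, s', o]: observation is 85% accurate
b0 = np.array([0.5, 0.5])

B = reachable_beliefs(b0, T, O, depth=5)
for delta in (0.05, 0.5, 2.0):
    print(delta, greedy_cover_size(B, delta))
```

The breadth-first enumeration grows exponentially with depth, so this only suits very small models; for larger problems one would sample belief trajectories instead. The point of the sketch is that the cover size depends on how the reachable beliefs spread out, not directly on the number of states.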

Journal: Proceedings of the ... AAAI Conference on Artificial Intelligence. AAAI Conference on Artificial Intelligence
Publication Date: Sep 20, 2021
Citations: 6

Similar Papers

Algorithms for partially observable Markov decision processes
Weihong Zhang
23 Dec 2014

Efficient Planning in Large POMDPs through Policy Graph Based Factorized Approximations
Joni Pajarinen ... Jaakko Peltonen
01 Jan 2009

Tractable POMDP-planning for robots with complex non-linear dynamics
Marcus Hoerger
16 Mar 2020

A Bayesian game based adaptive fuzzy controller for multiagent POMDPs
Rajneesh Sharma ... Matthijs T J Spaan
01 Jul 2010
