Closing the learning-planning loop with predictive state representations

Byron Boots, Sajid M Siddiqi, Geoffrey J Gordon

doi:10.1177/0278364911404092

Abstract

A central problem in artificial intelligence is to choose actions to maximize reward in a partially observable, uncertain environment. To do so, we must learn an accurate environment model, and then plan to maximize reward. Unfortunately, learning algorithms often recover a model that is too inaccurate to support planning or too large and complex for planning to succeed; or they require excessive prior domain knowledge or fail to provide guarantees such as statistical consistency. To address this gap, we propose a novel algorithm which provably learns a compact, accurate model directly from sequences of action-observation pairs. We then evaluate the learner by closing the loop from observations to actions. In more detail, we present a spectral algorithm for learning a predictive state representation (PSR), and evaluate it in a simulated, vision-based mobile robot planning task, showing that the learned PSR captures the essential features of the environment and enables successful and efficient planning. Our algorithm has several benefits which have not appeared together in any previous PSR learner: it is computationally efficient and statistically consistent; it handles high-dimensional observations and long time horizons; and our close-the-loop experiments provide an end-to-end practical test.

Highlights

We propose a novel algorithm for learning a variant of Predictive State Representations (PSRs) [12] directly from execution traces
Our algorithm is closely related to subspace identification for linear dynamical systems (LDSs) [15] and spectral algorithms for Hidden Markov Models (HMMs) [5] and reduced-rank HMMs [13]
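
As a rough illustration of the spectral idea behind the related HMM algorithms mentioned above, the following is a minimal sketch of spectral learning for a plain HMM without actions. It is not the paper's PSR learner. It assumes exact low-order observation moments are available (a real learner would estimate them from data), and all names here (exact_moments, spectral_learn, sequence_prob, hmm_prob, and the toy matrices T, O, pi) are hypothetical choices for this example.

```python
import numpy as np

# Toy HMM (assumed for illustration): 2 hidden states, 3 observations.
# T[i, j] = Pr[next state i | state j], O[i, j] = Pr[observation i | state j].
T = np.array([[0.7, 0.2],
              [0.3, 0.8]])
O = np.array([[0.6, 0.1],
              [0.3, 0.2],
              [0.1, 0.7]])
pi = np.array([0.5, 0.5])  # initial state distribution


def exact_moments(T, O, pi):
    """Exact moments of the first three observations x1, x2, x3."""
    P1 = O @ pi                           # P1[i] = Pr[x1 = i]
    P21 = O @ T @ np.diag(pi) @ O.T       # P21[i, j] = Pr[x2 = i, x1 = j]
    # P3x1[x][i, j] = Pr[x3 = i, x2 = x, x1 = j]
    P3x1 = [O @ T @ np.diag(O[x]) @ T @ np.diag(pi) @ O.T
            for x in range(O.shape[0])]
    return P1, P21, P3x1


def spectral_learn(P1, P21, P3x1, k):
    """Recover observable operators from moments via a rank-k SVD."""
    U = np.linalg.svd(P21)[0][:, :k]      # top-k left singular vectors
    b1 = U.T @ P1                         # initial predictive state
    binf = np.linalg.pinv(P21.T @ U) @ P1 # normalization vector
    right = np.linalg.pinv(U.T @ P21)
    B = [U.T @ P @ right for P in P3x1]   # one operator per observation
    return b1, binf, B


def sequence_prob(b1, binf, B, seq):
    """Probability of an observation sequence under the learned operators."""
    b = b1
    for x in seq:
        b = B[x] @ b
    return float(binf @ b)


def hmm_prob(T, O, pi, seq):
    """Reference probability from the true HMM (forward recursion)."""
    a = pi
    for x in seq:
        a = T @ (O[x] * a)
    return float(a.sum())


b1, binf, B = spectral_learn(*exact_moments(T, O, pi), k=2)
print(sequence_prob(b1, binf, B, [0, 1, 2]), hmm_prob(T, O, pi, [0, 1, 2]))
```

With exact moments and full-rank T and O, the two printed values should agree up to floating-point error, because the learned operators are a similarity transform of the true ones. The learned state lives in a k-dimensional subspace and is never mapped back to hidden-state probabilities, which is the sense in which such representations are predictive.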

Summary

Introduction

We propose a novel algorithm for learning a variant of PSRs [12] directly from execution traces. We show that the learned state space compactly captures the essential features of the environment, allows accurate prediction, and enables successful and efficient planning.


Journal: The International Journal of Robotics Research
Publication Date: Jun 1, 2011
Citations: 187
License type: cc-by

Similar Papers

Closing the Learning-Planning Loop with Predictive State Representations
B Boots ... G Gordon
27 Jun 2010

Closing the learning-planning loop with predictive state representations
10 May 2010

Closing the Learning-Planning Loop with Predictive State Representations
Byron Boots ... Geoffrey J Gordon
05 Aug 2011

Learning and planning in partially observable environments without prior domain knowledge
Yunlong Liu ... Fangfang Chang
International Journal of Approximate Reasoning | VOL. 142
08 Dec 2021
