Bisimulation Metrics for Continuous Markov Decision Processes

Norm Ferns, Prakash Panangaden, Doina Precup

doi:10.1137/10080484x

Abstract

In recent years, various metrics have been developed for measuring the behavioral similarity of states in probabilistic transition systems [J. Desharnais et al., Proceedings of CONCUR'99, Springer-Verlag, London, 1999, pp. 258–273; F. van Breugel and J. Worrell, Proceedings of ICALP'01, Springer-Verlag, London, 2001, pp. 421–432]. In the context of finite Markov decision processes (MDPs), we have built on these metrics to provide a robust quantitative analogue of stochastic bisimulation [N. Ferns, P. Panangaden, and D. Precup, Proceedings of UAI-04, AUAI Press, Arlington, VA, 2004, pp. 162–169] and an efficient algorithm for its calculation [N. Ferns, P. Panangaden, and D. Precup, Proceedings of UAI-06, AUAI Press, Arlington, VA, 2006, pp. 174–181]. In this paper, we seek to properly extend these bisimulation metrics to MDPs with continuous state spaces. In particular, we provide the first distance-estimation scheme for metrics based on bisimulation for continuous probabilistic transition systems. Our work, based on statistical sampling and infinite dimensional linear programming, is a crucial first step in formally guiding real-world planning, where tasks are usually continuous and highly stochastic in nature, e.g., robot navigation, and often a substitution with a parametric model or crude finite approximation must be made. We show that the optimal value function associated with a discounted infinite-horizon planning task is continuous with respect to metric distances. Thus, our metrics allow one to reason about the quality of solution obtained by replacing one model with another. Alternatively, they may potentially be used directly for state aggregation.
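To make the link between metric distance and value differences concrete, here is a minimal sketch for the finite case only. It is not the sampling and linear-programming scheme this paper develops for continuous state spaces. It assumes a deterministic finite MDP, where the Kantorovich distance between two point-mass transition distributions reduces to the current distance between the two successor states, so the metric can be iterated as d(s, t) = max over actions a of c_r * |r(s, a) - r(t, a)| + c_t * d(s'_a, t'_a). The function names, the list-of-lists input layout, the toy MDP, and the weights c_r = 1 and c_t = gamma = 0.9 are all illustrative assumptions.

```python
from itertools import product


def bisimulation_metric(rewards, next_state, c_r=1.0, c_t=0.9, tol=1e-9):
    """Iterate the metric to a fixed point for a finite, deterministic MDP.

    rewards[s][a] is the immediate reward and next_state[s][a] the successor
    index (a hypothetical input layout chosen for this sketch).
    """
    n = len(rewards)
    n_actions = len(rewards[0])
    pairs = list(product(range(n), repeat=2))
    d = [[0.0] * n for _ in range(n)]
    while True:
        new_d = [[0.0] * n for _ in range(n)]
        for s, t in pairs:
            # Deterministic transitions: the Kantorovich term is just the
            # current distance between the two successor states.
            new_d[s][t] = max(
                c_r * abs(rewards[s][a] - rewards[t][a])
                + c_t * d[next_state[s][a]][next_state[t][a]]
                for a in range(n_actions)
            )
        gap = max(abs(new_d[s][t] - d[s][t]) for s, t in pairs)
        d = new_d
        if gap < tol:
            return d


def optimal_values(rewards, next_state, gamma=0.9, tol=1e-9):
    """Plain value iteration for the same deterministic MDP."""
    n = len(rewards)
    n_actions = len(rewards[0])
    v = [0.0] * n
    while True:
        new_v = [
            max(rewards[s][a] + gamma * v[next_state[s][a]] for a in range(n_actions))
            for s in range(n)
        ]
        gap = max(abs(new_v[s] - v[s]) for s in range(n))
        v = new_v
        if gap < tol:
            return v


# Toy MDP (made up for illustration): states 0 and 1 behave identically,
# state 2 earns different rewards.
rewards = [[1.0, 0.0], [1.0, 0.0], [0.0, 0.5]]
next_state = [[0, 2], [1, 2], [2, 0]]

d = bisimulation_metric(rewards, next_state)
v = optimal_values(rewards, next_state)
```

In this toy example the two behaviorally identical states end up at distance zero, and for every pair of states the gap between optimal values stays below the metric distance, which is the kind of guarantee that would justify aggregating nearby states.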

Journal: SIAM Journal on Computing
Publication Date: Jan 1, 2011
Citations: 100

Similar Papers

Weak Bisimulation Metrics in Models with Nondeterminism and Continuous State Spaces
Ruggero Lanotte ... Simone Tini
01 Jan 2018

A weak semantic approach to bisimulation metrics in models with nondeterminism and continuous state spaces
Ruggero Lanotte ... Simone Tini
Theoretical Computer Science | VOL. 869
15 Jan 2021

Scalable Methods for Computing State Similarity in Deterministic Markov Decision Processes
Pablo Samuel Castro
Proceedings of the AAAI Conference on Artificial Intelligence | VOL. 34
03 Apr 2020

Testing Labelled Markov Processes
Franck Van Breugel ... James Worrell
01 Jan 2002
