Equivalence notions and model minimization in Markov decision processes

Robert Givan, Thomas Dean, Matthew Greig

doi:10.1016/s0004-3702(02)00376-4

Abstract

Many stochastic planning problems can be represented using Markov Decision Processes (MDPs). A difficulty with these MDP representations is that the common algorithms for solving them run in time polynomial in the size of the state space, and this size is extremely large for most real-world planning problems of interest. Recent AI research has addressed this problem by representing the MDP in a factored form. Factored MDPs, however, are not amenable to traditional solution methods that call for an explicit enumeration of the state space. One familiar way to solve MDP problems with very large state spaces is to form a reduced (or aggregated) MDP with the same properties as the original MDP by combining “equivalent” states. In this paper, we discuss applying this approach to solving factored MDP problems: we avoid enumerating the state space by describing large blocks of “equivalent” states in factored form, with the block descriptions being inferred directly from the original factored representation. The resulting reduced MDP may have exponentially fewer states than the original factored MDP, and can then be solved using traditional methods. The reduced MDP found depends on the notion of equivalence between states used in the aggregation. The notion of equivalence chosen will be fundamental in designing and analyzing algorithms for reducing MDPs. Ideally, these algorithms will be able to find the smallest possible reduced MDP for any given input MDP and notion of equivalence (i.e., find the “minimal model” for the input MDP). Unfortunately, the classic notion of state equivalence from non-deterministic finite state machines generalized to MDPs does not prove useful. We present here a notion of equivalence that is based upon the notion of bisimulation from the literature on concurrent processes. Our generalization of bisimulation to stochastic processes yields a non-trivial notion of state equivalence that guarantees the optimal policy for the reduced model immediately induces a corresponding optimal policy for the original model. With this notion of state equivalence, we design and analyze an algorithm that minimizes arbitrary factored MDPs and compare this method analytically to previous algorithms for solving factored MDPs. We show that previous approaches implicitly derive equivalence relations that we define here.
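
To make the idea of combining "equivalent" states concrete, the following is a minimal sketch of partition refinement on a small, explicitly enumerated MDP. It is not the factored algorithm the paper describes, which avoids enumerating states; it only illustrates the equivalence being computed. The sketch assumes that two states are equivalent when they have the same reward for every action and, for every action, the same probability of moving into each block of the partition. The function names, the data layout (`P[s][a]` as a dictionary of successor probabilities, `R[s][a]` as a reward), the rounding tolerance, and the four-state example are all hypothetical and chosen for illustration.

```python
def block_mass(P, s, a, block_of, n_blocks):
    """Probability mass that action a moves from state s into each block."""
    mass = [0.0] * n_blocks
    for t, p in P[s][a].items():
        mass[block_of[t]] += p
    # Rounding is an assumed tolerance for floating-point comparison.
    return tuple(round(m, 9) for m in mass)


def group(states, key_of):
    """Assign consecutive block ids to states that share the same key."""
    ids = {}
    return {s: ids.setdefault(key_of[s], len(ids)) for s in states}


def coarsest_partition(states, actions, P, R):
    """Refine a reward-based partition until no block needs splitting."""
    # Start by grouping states with identical per-action rewards.
    block_of = group(states, {s: tuple(R[s][a] for a in actions) for s in states})
    while True:
        n = len(set(block_of.values()))
        # A state's signature: its current block plus, for each action,
        # the probability of landing in each current block.
        signature = {
            s: (block_of[s],)
            + tuple(block_mass(P, s, a, block_of, n) for a in actions)
            for s in states
        }
        refined = group(states, signature)
        # The signature includes the old block, so the new partition is a
        # refinement; an unchanged block count means it is stable.
        if len(set(refined.values())) == n:
            return refined
        block_of = refined


# Hypothetical example: s1 and s2 behave identically and should be merged.
states = ["s0", "s1", "s2", "s3"]
actions = ["a", "b"]
P = {
    "s0": {"a": {"s1": 0.5, "s2": 0.5}, "b": {"s0": 1.0}},
    "s1": {"a": {"s3": 1.0}, "b": {"s1": 0.5, "s3": 0.5}},
    "s2": {"a": {"s3": 1.0}, "b": {"s2": 0.5, "s3": 0.5}},
    "s3": {"a": {"s3": 1.0}, "b": {"s3": 1.0}},
}
R = {
    "s0": {"a": 0.0, "b": 0.0},
    "s1": {"a": 0.0, "b": 0.0},
    "s2": {"a": 0.0, "b": 0.0},
    "s3": {"a": 1.0, "b": 1.0},
}
blocks = coarsest_partition(states, actions, P, R)
n_reduced = len(set(blocks.values()))
```

In this example the four original states collapse into three blocks, with `s1` and `s2` sharing a block. A reduced MDP would then use one state per block, and a policy for it maps back to the original by letting each state act as its block does.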

Journal: Artificial Intelligence
Publication Date: Feb 12, 2003
Citations: 326
License type: elsevier-specific

Similar Papers

Solving K-MDPs
Jonathan Ferrer-Mestres ... Olivier Buffet
Proceedings of the International Conference on Automated Planning and Scheduling | VOL. 30 | 01 Jun 2020

Suboptimal policy determination for large-scale Markov decision processes, Part 1: Description and bounds
C C White ... J L Popyack
Journal of Optimization Theory and Applications | VOL. 46 | 01 Jul 1985

Hierarchical Approximate Policy Iteration With Binary-Tree State Space Decomposition
Xin Xu ... Dewen Hu
IEEE Transactions on Neural Networks | VOL. 22 | 10 Oct 2011

Distributed Service Migration in Satellite Mobile Edge Computing
Zhen Li ... Chunxiao Jiang
01 Dec 2021
