Abstract
We propose a novel approach, called parallel rollout, to solving (partially observable) Markov decision processes. Our approach generalizes the rollout algorithm of Bertsekas and Castanon (1999) by rolling out a set of multiple heuristic policies rather than a single policy. In particular, parallel rollout targets the class of problems where multiple heuristic policies are available such that each policy performs near-optimally on a different set of system paths. Parallel rollout automatically combines the given policies to create a new policy that adapts to the different system paths and improves upon the performance of every policy in the set. We formally prove this claim for two criteria: total expected reward and infinite-horizon discounted reward. Parallel rollout also resolves the key issue of selecting which policy to roll out when the performance of the available heuristic policies cannot be predicted in advance. We present two example problems that illustrate the effectiveness of the approach: a buffer management problem and a multiclass scheduling problem.
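To make the combination step concrete, here is a minimal Python sketch of Monte Carlo parallel rollout under simple assumptions: a generative simulator `simulate(state, action) -> (next_state, reward)` and base policies given as callables from states to actions. All names and parameters (`parallel_rollout_action`, `horizon`, `num_trajectories`, etc.) are illustrative, not from the paper.

```python
def parallel_rollout_action(state, actions, policies, simulate, horizon,
                            gamma=1.0, num_trajectories=32):
    """Choose an action by one-step lookahead over the parallel rollout value.

    Estimates, for each candidate action a,
        Q(s, a) ~= E[ r + gamma * max_{pi in policies} V^pi(s') ],
    and returns the action with the highest estimate.
    """
    best_action, best_value = None, float("-inf")
    for action in actions:
        total = 0.0
        for _ in range(num_trajectories):
            next_state, reward = simulate(state, action)
            # Crude single-sample estimate of max_pi V^pi(next_state):
            # roll each base policy out once and keep the best return.
            # Averaging several inner rollouts per policy would reduce noise.
            best_policy_return = max(
                _rollout(next_state, policy, simulate, horizon - 1, gamma)
                for policy in policies
            )
            total += reward + gamma * best_policy_return
        value = total / num_trajectories
        if value > best_value:
            best_action, best_value = action, value
    return best_action


def _rollout(state, policy, simulate, steps, gamma):
    """Discounted return of following `policy` from `state` for `steps` steps."""
    total, discount = 0.0, 1.0
    for _ in range(steps):
        action = policy(state)
        state, reward = simulate(state, action)
        total += discount * reward
        discount *= gamma
    return total
```

Taking the maximum over the base policies inside the expectation is what lets the resulting policy adapt to whichever heuristic is best on the realized system path, which is the intuition behind the improvement guarantee stated in the abstract.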