Improving the Convergence of RTDP

Tian Liang, Minghao Yin, Jigui Sun

doi:10.1109/fskd.2007.360

Abstract

Real-time dynamic programming (RTDP) is an outstanding real-time algorithm for solving non-deterministic planning problems with full observability. Compared with other DP algorithms, RTDP has two key advantages: first, it obtains an optimal policy without computing over the whole state space; second, it has good anytime behavior. However, RTDP converges slowly. In this paper, we introduce RTDP(k), an algorithm based on RTDP with a similar structure. RTDP(k) improves convergence while retaining the properties of a real-time algorithm. RTDP(k) updates k extended states per iteration following a "bounded propagation strategy". For Markov decision processes (MDPs), and in particular stochastic shortest-path problems (SSPs), we prove two results: first, every RTDP(k) trial terminates in a finite number of steps; second, RTDP(k) eventually converges to an optimal policy. From a practical point of view, we show that RTDP(k) produces better solutions in the first trial and converges faster than RTDP on benchmarks for real-time search.
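
For readers unfamiliar with the baseline, the following is a minimal sketch of standard RTDP on a stochastic shortest-path problem, not of RTDP(k): the abstract does not specify the "bounded propagation strategy", so the k extended updates are not reproduced here. The toy problem, the unit action cost, the zero initial heuristic, and all names (TRANSITIONS, q_value, rtdp_trial, rtdp) are assumptions made for illustration.

```python
import random

# Hypothetical toy SSP (not from the paper): state 3 is the goal.
# TRANSITIONS[state][action] = list of (probability, next_state).
TRANSITIONS = {
    0: {"safe": [(1.0, 1)], "risky": [(0.5, 3), (0.5, 0)]},
    1: {"safe": [(1.0, 2)]},
    2: {"safe": [(1.0, 3)]},
}
GOAL = 3
COST = 1.0  # assumed: every action costs 1


def q_value(V, s, a):
    """Expected cost of taking action a in state s, then following V."""
    return COST + sum(p * V[s2] for p, s2 in TRANSITIONS[s][a])


def greedy_action(V, s):
    """Action with the lowest Q-value under the current value estimate."""
    return min(TRANSITIONS[s], key=lambda a: q_value(V, s, a))


def rtdp_trial(V, start, rng, max_steps=1000):
    """One trial: update only the states actually visited, until the goal."""
    s, steps = start, 0
    while s != GOAL and steps < max_steps:
        a = greedy_action(V, s)
        V[s] = q_value(V, s, a)  # Bellman update of the current state only
        outcomes = TRANSITIONS[s][a]
        s = rng.choices([s2 for _, s2 in outcomes],
                        weights=[p for p, _ in outcomes])[0]
        steps += 1
    return steps


def rtdp(start=0, trials=200, seed=0):
    """Run repeated trials from the start state with a zero initial heuristic."""
    rng = random.Random(seed)
    V = {s: 0.0 for s in list(TRANSITIONS) + [GOAL]}  # zero is admissible here
    for _ in range(trials):
        rtdp_trial(V, start, rng)
    return V
```

In this sketch only the states reached by the greedy policy are updated repeatedly, which illustrates the first advantage named in the abstract: states 1 and 2 are visited during the first trial and never again, yet the value of the start state still converges.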

Similar Papers

Frontier-Based RTDP: A New Approach to Solving the Robotic Adversarial Coverage Problem
04 May 2015

Online learning for wireless video transmission with limited information
Yu Zhang ... Mihaela Van Der Schaar
01 May 2009

UAV Payload Transportation via RTDP Based Optimized Velocity Profiles
Abdullah Mohiuddin ... Dongming Gan
Energies, vol. 12
08 Aug 2019

Optimal Action Criterion and Algorithm Improvement of Real-Time Dynamic Programming
Chang-Jie Fan ... Xiao-Ping Chen
Journal of Software, vol. 19
07 Apr 2009
