A Continuous-Time Markov Decision Process-Based Method With Application in a Pursuit-Evasion Example

Shengde Jia, Xiangke Wang, Lincheng Shen

doi:10.1109/tsmc.2015.2478875

Abstract

This paper presents a novel method based on the continuous-time Markov decision process (CTMDP) to address uncertainties in the pursuit-evasion problem. The primary difference between the CTMDP and the Markov decision process (MDP) is that the former takes into account the influence of the transition time between states. A policy iteration method based on the performance potential is presented for solving the CTMDP, together with its convergence. The results obtained with the MDP-based method show that it is a special case of the CTMDP-based method in which the transition rate matrix is the identity. To compare the methods, a well-known pursuit-evasion problem involving two identical cars is solved as a benchmark. The CTMDP-based method provides a discretized solution that is close to the analytical solution obtained by the differential game method. It also shows strong robustness against changes in the transition probability compared with the traditional MDP-based method. To the best of our knowledge, this is the first attempt to validate the influence of the transition time between states in such a pursuit-evasion scenario, or in a similar application, solved by an MDP-related model. The CTMDP-based method offers a new approach to solving the pursuit-evasion problem and can be extended to similar optimization applications.
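To make the role of the transition time concrete, the following is a minimal sketch of policy iteration on a toy CTMDP. It is not the paper's algorithm: the abstract describes a performance-potential formulation, while this sketch swaps in the simpler discounted criterion, where the value of a policy satisfies (alpha + lambda) V(s) = r(s, a) + lambda * sum over s' of p(s' | s, a) V(s'). The two-state model, the names RATES, PROBS, REWARDS and ALPHA, and all numbers are hypothetical and chosen only for illustration.

```python
# Toy discounted CTMDP (all values are hypothetical, not from the paper).
# RATES[s][a]   : exponential transition rate (1 / mean holding time)
# PROBS[s][a][t]: probability of moving to state t when a transition fires
# REWARDS[s][a] : reward earned per unit time while in s under action a
# ALPHA         : continuous-time discount rate
RATES = [[1.0, 3.0], [1.0, 3.0]]
PROBS = [[[0.2, 0.8], [0.1, 0.9]], [[0.7, 0.3], [0.9, 0.1]]]
REWARDS = [[1.0, 0.0], [2.0, 0.5]]
ALPHA = 0.1


def backup(s, a, v, rates, probs, rewards, alpha):
    """One-step value of taking action a in state s, then following v."""
    lam = rates[s][a]
    expected = sum(p * v[t] for t, p in enumerate(probs[s][a]))
    # The rate lam weights the successor values, so holding time matters.
    return (rewards[s][a] + lam * expected) / (alpha + lam)


def evaluate(policy, rates, probs, rewards, alpha, tol=1e-12):
    """Evaluate a fixed policy by iterating the backup to a fixed point."""
    v = [0.0] * len(policy)
    while True:
        new_v = [backup(s, policy[s], v, rates, probs, rewards, alpha)
                 for s in range(len(policy))]
        if max(abs(x - y) for x, y in zip(new_v, v)) < tol:
            return new_v
        v = new_v


def policy_iteration(rates, probs, rewards, alpha):
    """Alternate evaluation and greedy improvement until the policy is stable."""
    policy = [0] * len(rates)
    while True:
        v = evaluate(policy, rates, probs, rewards, alpha)
        new_policy = []
        for s in range(len(rates)):
            best = policy[s]
            best_q = backup(s, best, v, rates, probs, rewards, alpha)
            for a in range(len(rates[s])):
                q = backup(s, a, v, rates, probs, rewards, alpha)
                if q > best_q + 1e-9:  # switch only on a strict improvement
                    best, best_q = a, q
            new_policy.append(best)
        if new_policy == policy:
            return policy, v
        policy = new_policy


if __name__ == "__main__":
    print(policy_iteration(RATES, PROBS, REWARDS, ALPHA))
```

If every entry of RATES is set to the same constant, the backup takes the shape of an ordinary discrete-time discounted update. This is offered only as an intuition for why a discrete MDP can appear as a special case; the paper's exact construction with the identity transition rate matrix may differ.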

Journal: IEEE Transactions on Systems, Man, and Cybernetics: Systems
Publication Date: Sep 1, 2016
Citations: 37

Similar Papers

Semi-Markov and Jump Markov Controlled Models: Average Cost Criterion
M Yu Kitayev
Theory of Probability & Its Applications | VOL. 30
01 Jun 1986

Contraction Mappings in the Theory Underlying Dynamic Programming
Eric V Denardo
SIAM Review | VOL. 9
01 Apr 1967

A Continuous-time Markov Decision Process Based Method on Pursuit-Evasion Problem
Jia Shengde ... Zhu Huayong
IFAC Proceedings Volumes | VOL. 47
01 Jan 2014

Continuous-time Markov decision process with average reward: Using reinforcement learning method
Shengde Jia ... Lincheng Shen
01 Jul 2015
