Abstract
This paper presents a novel method, the continuous-time Markov decision process (CTMDP), to address the uncertainties in the pursuit-evasion problem. The primary difference between the CTMDP and the Markov decision process (MDP) is that the former takes into account the influence of the transition time between states. A policy iteration method based on potential performance for solving the CTMDP is presented, together with its convergence. The results show that the MDP-based method is a special case of the CTMDP-based method in which the transition rate matrix is the identity. To compare the methods, a well-known pursuit-evasion problem involving two identical cars is solved as a benchmark. The CTMDP-based method provides a discretized solution that is close to the analytical solution obtained by the differential game method. It also shows strong robustness against changes in the transition probability compared with the traditional MDP-based method. To the best of our knowledge, this is the first attempt to validate the influence of the transition time between states in such a pursuit-evasion scenario, or in a similar application, solved by an MDP-related model. The CTMDP-based method offers a new approach to solving the pursuit-evasion problem and can be extended to similar optimization applications.
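To make the idea of policy iteration on a CTMDP concrete, the following is a minimal sketch for a generic finite, discounted-reward CTMDP. It is not the paper's pursuit-evasion model, and the paper's potential-performance formulation may differ. The function name `policy_iteration_ctmdp`, the array layout, the discount rate `alpha`, and the two-state toy example are all assumptions made for illustration. Transition times enter through the rate matrices `Q[a]`, whose rows sum to zero.

```python
import numpy as np

def policy_iteration_ctmdp(Q, r, alpha, max_iter=100):
    """Policy iteration for a finite discounted CTMDP (illustrative sketch).

    Q     : array (A, S, S), Q[a] is the transition rate matrix under action a
            (off-diagonal entries >= 0, each row sums to zero).
    r     : array (S, A), reward rate earned per unit time in state s under action a.
    alpha : discount rate, alpha > 0.
    """
    n_actions, n_states, _ = Q.shape
    states = np.arange(n_states)
    policy = np.zeros(n_states, dtype=int)
    for _ in range(max_iter):
        # Policy evaluation: solve (alpha * I - Q_pi) v = r_pi.
        Q_pi = Q[policy, states, :]
        r_pi = r[states, policy]
        v = np.linalg.solve(alpha * np.eye(n_states) - Q_pi, r_pi)
        # Policy improvement: maximize r(s, a) + sum_j Q[a][s, j] * v[j].
        gain = r + np.einsum("asj,j->sa", Q, v)
        current = gain[states, policy]
        # Switch actions only on a strict improvement, to avoid cycling on ties.
        new_policy = np.where(gain.max(axis=1) > current + 1e-12,
                              gain.argmax(axis=1), policy)
        if np.array_equal(new_policy, policy):
            break
        policy = new_policy
    return policy, v

# Hypothetical two-state example: state 1 pays reward rate 1, state 0 pays 0.
# Action 0 switches states slowly (rate 1), action 1 switches quickly (rate 5).
Q = np.array([[[-1.0, 1.0], [1.0, -1.0]],
              [[-5.0, 5.0], [5.0, -5.0]]])
r = np.array([[0.0, 0.0],
              [1.0, 1.0]])
policy, v = policy_iteration_ctmdp(Q, r, alpha=0.1)
print(policy, v)
```

In this toy model the resulting policy leaves the unrewarded state quickly and the rewarded state slowly, which shows how the choice of transition rates, and hence the expected transition times, shapes the solution.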
More From: IEEE Transactions on Systems, Man, and Cybernetics: Systems