Abstract

In this paper, we introduce an optimization method that improves the pursuit performance of a pursuer in a pursuit-evasion game (PEG). Pursuers build a probability map and employ a hybrid pursuit policy, which combines the merits of the local-max and global-max pursuit policies, to search for and capture evaders as quickly as possible in a two-dimensional space. We propose an episodic parameter optimization (EPO) algorithm to learn good values for the weighting parameters of the hybrid pursuit policy. The EPO algorithm runs many episodes of the PEG repeatedly, accumulates the reward of each episode using reinforcement learning, and selects the candidate weighting parameter that maximizes the total averaged reward using the golden section search method. We found the best pursuit policy in various situations, varying the number of evaders and the size of the space, and analyzed the results.
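
The parameter selection step can be illustrated with a minimal sketch of golden section search applied to an averaged episode reward. This is not the authors' implementation. The function `toy_episode_reward`, its peak at 0.3, the noise level, the number of episodes, and the search interval [0, 1] are all hypothetical stand-ins for the PEG simulation, chosen only to make the example runnable. The sketch also assumes a single weighting parameter and a unimodal averaged reward.

```python
import math
import random

INV_PHI = (math.sqrt(5) - 1) / 2  # 1/phi, about 0.618


def toy_episode_reward(w, seed):
    """Hypothetical stand-in for one PEG episode (not the paper's simulator).

    The reward peaks at w = 0.3 and carries seed-dependent noise.
    """
    noise = random.Random(seed).gauss(0.0, 0.05)
    return -(w - 0.3) ** 2 + noise


def averaged_reward(w, n_episodes=20):
    """Average the reward over a fixed set of episodes.

    The same seeds are reused for every w (an assumption of this sketch),
    so two candidates are compared under identical noise.
    """
    total = sum(toy_episode_reward(w, seed) for seed in range(n_episodes))
    return total / n_episodes


def golden_section_max(f, lo, hi, tol=1e-4):
    """Return an approximate maximizer of a unimodal function f on [lo, hi]."""
    c = hi - INV_PHI * (hi - lo)
    d = lo + INV_PHI * (hi - lo)
    fc, fd = f(c), f(d)
    while hi - lo > tol:
        if fc > fd:
            # The maximum lies in [lo, d]; reuse c as the new upper probe.
            hi, d, fd = d, c, fc
            c = hi - INV_PHI * (hi - lo)
            fc = f(c)
        else:
            # The maximum lies in [c, hi]; reuse d as the new lower probe.
            lo, c, fc = c, d, fd
            d = lo + INV_PHI * (hi - lo)
            fd = f(d)
    return (lo + hi) / 2


best_w = golden_section_max(averaged_reward, 0.0, 1.0)
print(f"selected weighting parameter: {best_w:.4f}")
```

Each iteration shrinks the search interval by a factor of about 0.618 and needs only one new evaluation of the averaged reward, which matters when every evaluation costs many simulated episodes.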
