Abstract

In a pursuit-evasion game, pursuers can usually capture the evaders when the deployment environment resembles the one they were trained in. However, when some pursuers break down or new pursuers join, the number of agents at deployment differs from the number used during training; in other words, the environment has changed. For a multi-agent deep reinforcement learning algorithm, this means the input and output dimensions of the network change, so the trained pursuers may fail to capture the evaders in real-world applications. To solve this problem, we propose a multi-agent reinforcement learning framework that allows the pursuers to capture the evaders even when the number of pursuers changes. Building on the deep deterministic policy gradient (DDPG) framework and a bidirectional recurrent neural network (Bi-RNN), we propose a scalable deep reinforcement learning method for the pursuit-evasion game and apply it to a multi-agent pursuit-evasion game in a 2D dynamic environment. In this game, the evaders are faster than the pursuers, but there are fewer evaders than pursuers. Our experimental results show that the algorithm improves the scalability and stability of the multi-agent pursuit-evasion game.
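The scalability idea in the abstract rests on a property of bidirectional RNNs: because the network is unrolled over the list of agents rather than having one fixed input slot per agent, the same weights produce one output per agent for any team size. The sketch below is a minimal pure-Python illustration of that property, not the paper's implementation; the scalar weights, `tanh` recurrence, and function names are all hypothetical simplifications of the Bi-RNN actor described above.

```python
import math

def rnn_pass(observations, w_in, w_rec, reverse=False):
    """One directional RNN pass over a variable-length list of
    per-agent observations; returns one hidden state per agent."""
    seq = list(reversed(observations)) if reverse else observations
    h, hidden = 0.0, []
    for obs in seq:
        # tanh recurrence with shared scalar weights (illustrative only)
        h = math.tanh(w_in * obs + w_rec * h)
        hidden.append(h)
    return list(reversed(hidden)) if reverse else hidden

def bi_rnn_policy(observations, w_in=0.5, w_rec=0.3):
    """Bidirectional pass: pair the forward and backward hidden states
    so each agent's output depends on all teammates, while the
    per-agent output size stays fixed regardless of team size."""
    fwd = rnn_pass(observations, w_in, w_rec)
    bwd = rnn_pass(observations, w_in, w_rec, reverse=True)
    return list(zip(fwd, bwd))

# The same weights handle 3 pursuers or 5 pursuers without any
# change to the network's input/output dimensions:
three = bi_rnn_policy([0.1, 0.2, 0.3])
five = bi_rnn_policy([0.1, 0.2, 0.3, 0.4, 0.5])
print(len(three), len(five))  # one (forward, backward) pair per agent
```

In a full actor-critic setup such as DDPG, each `(forward, backward)` pair would feed a small shared output head to produce that pursuer's continuous action, so adding or removing a pursuer changes only the unroll length, not the architecture.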
