Abstract

Multi-agent collaborative decision-making is one of the most promising research areas in reinforcement learning, yet it faces the challenge of real-time control in dynamic environments. This paper focuses on the path planning problem for collaborative data collection by multiple mobile sinks in the Internet of Things. Building on the strong collaborative decision-making performance of the multi-agent deep deterministic policy gradient (MADDPG) algorithm, a novel path planning algorithm is proposed within a constructed wireless sensor network model. Furthermore, a combined prioritized experience replay strategy is designed to increase the utilization of important experiences in MADDPG. Extensive experiments conducted under various conditions show that the proposed algorithm outperforms both traditional MADDPG and MADDPG with prioritized experience replay. Its faster convergence, more effective training, and shorter paths indicate the capability of the proposed algorithm for multi-agent collaborative path planning problems.

