Abstract

In this paper, a Prioritized Experience Replay (PER) strategy and a Long Short-Term Memory (LSTM) neural network are introduced into the path planning process of mobile robots, addressing the slow convergence and inaccurate perception of dynamic obstacles of the Twin Delayed Deep Deterministic Policy Gradient (TD3) algorithm. We call the new method PL-TD3. First, the PER strategy is introduced to improve the convergence speed of the algorithm. Second, the LSTM network is used to improve the algorithm's perception of dynamic obstacles. To verify the proposed method, we design experiments in a static environment, a dynamic environment, and a dynamic-adaptability setting, and compare the algorithm before and after the improvement. The experimental results show that PL-TD3 outperforms TD3 in both execution time and path length in all environments.
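
The abstract does not specify implementation details, but the PER component typically follows proportional prioritization, where transitions are sampled with probability proportional to their absolute TD error and reweighted with importance-sampling weights. The sketch below illustrates this standard mechanism; the class name, hyperparameters, and structure are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

class PrioritizedReplayBuffer:
    """Minimal sketch of proportional prioritized experience replay."""

    def __init__(self, capacity, alpha=0.6, beta=0.4, eps=1e-6):
        self.capacity = capacity
        self.alpha = alpha      # how strongly priorities bias sampling
        self.beta = beta        # strength of importance-sampling correction
        self.eps = eps          # keeps every priority strictly positive
        self.buffer = []
        self.priorities = np.zeros(capacity, dtype=np.float64)
        self.pos = 0

    def add(self, transition):
        # New transitions receive the current maximum priority so they are
        # sampled at least once before their TD error is known.
        max_prio = self.priorities.max() if self.buffer else 1.0
        if len(self.buffer) < self.capacity:
            self.buffer.append(transition)
        else:
            self.buffer[self.pos] = transition
        self.priorities[self.pos] = max_prio
        self.pos = (self.pos + 1) % self.capacity

    def sample(self, batch_size):
        prios = self.priorities[:len(self.buffer)]
        probs = prios ** self.alpha
        probs /= probs.sum()
        idxs = np.random.choice(len(self.buffer), batch_size, p=probs)
        # Importance-sampling weights correct the bias of non-uniform sampling.
        weights = (len(self.buffer) * probs[idxs]) ** (-self.beta)
        weights /= weights.max()
        batch = [self.buffer[i] for i in idxs]
        return batch, idxs, weights

    def update_priorities(self, idxs, td_errors):
        # Priority is proportional to the absolute TD error of each sample.
        for i, err in zip(idxs, td_errors):
            self.priorities[i] = abs(err) + self.eps
```

In a TD3 training loop, the critic's TD errors for the sampled batch would be passed back through `update_priorities`, so transitions with larger errors are revisited more often, which is the mechanism the paper credits for faster convergence.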
