Abstract

Mobile robotics has a wide range of applications, and path planning is key to realizing them. Mobile robots need to explore their environment autonomously to reach their destinations. The Deep Deterministic Policy Gradient (DDPG) algorithm, a classical deep reinforcement learning algorithm, is well suited to continuous control problems. However, DDPG suffers from low training efficiency and slow convergence because, without policy action filtering, a high proportion of the explored actions are illegal. In this paper, we propose a mobile robot path planning method based on an improved DDPG reinforcement learning algorithm. The method uses a small amount of a priori knowledge to accelerate the training of deep reinforcement learning and reduce the number of trial-and-error episodes, and it adopts an adaptive exploration method based on the $\varepsilon$-greedy algorithm that dynamically adjusts the exploration factor to allocate the probabilities of exploration and exploitation appropriately. The adaptive exploration method improves exploration efficiency, shortens the exploration phase, and speeds up the convergence of the algorithm. Simulation experiments are conducted in a grid environment, and the results show that the proposed algorithm can successfully find the optimal path. Moreover, comparison experiments with Q-learning and SARSA demonstrate that the proposed algorithm achieves better path planning performance, requires the least computation time, and converges fastest.

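The adaptive exploration idea described above can be illustrated with a minimal sketch of an $\varepsilon$-greedy selector whose exploration factor is adjusted during training. The abstract does not give the exact update rule, so the decay schedule, the use of a recent success rate as the adjustment signal, and all names (AdaptiveEpsilonGreedy, eps_min, decay) are illustrative assumptions rather than the authors' implementation.

import random

class AdaptiveEpsilonGreedy:
    """Epsilon-greedy action selection with an adaptively decayed exploration factor.

    Assumed schedule: exponential decay toward eps_min, decaying faster as the
    agent's recent success rate rises, so probability mass shifts from
    exploration to exploitation over training.
    """

    def __init__(self, eps_start=1.0, eps_min=0.05, decay=0.995):
        self.eps = eps_start
        self.eps_min = eps_min
        self.decay = decay

    def update(self, recent_success_rate):
        # Scale the decay by the recent success rate (in [0, 1]):
        # the more reliably the robot reaches the goal, the faster eps shrinks.
        factor = self.decay * (1.0 - 0.5 * recent_success_rate)
        self.eps = max(self.eps_min, self.eps * max(factor, 0.5))

    def choose(self, greedy_action, action_space):
        # With probability eps pick a random action (explore);
        # otherwise exploit the policy's current greedy action.
        if random.random() < self.eps:
            return random.choice(action_space)
        return greedy_action

As a usage sketch, the selector would wrap the policy's action choice each step, and update() would be called once per episode with the fraction of recent episodes in which the goal was reached.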