Deep reinforcement learning methods have been applied to mobile robot navigation to find the optimal path to the target. Rewards are usually given only when the task is completed, which may lead to local optima during training and seriously degrades the training efficiency and navigation performance of the mobile robot. To this end, this paper proposes an intrinsic reward mechanism, comprising an intrinsic curiosity module and a randomness-enhancement module, combined with the twin-delayed deep deterministic policy gradient (TD3) reinforcement learning algorithm for mobile robot navigation. It effectively resolves the slow convergence caused by sparse rewards in continuous action spaces, encourages mobile robots to explore unknown areas, and reduces the occurrence of local optima. The experimental results show that the proposed navigation method significantly improves the training efficiency of mobile robots: out of 1000 test episodes, only 3 exceeded the maximum step limit, and the success rate rises to 83.5%, outperforming existing navigation methods.
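To illustrate the idea behind the curiosity-driven bonus the abstract describes, here is a minimal sketch (not the paper's exact architecture) of an ICM-style intrinsic reward: a learned forward model predicts the next state feature from the current feature and action, and its prediction error is added to the sparse extrinsic reward. All names (`ForwardModel`, `eta`, the linear model itself) are illustrative assumptions.

```python
import numpy as np

class ForwardModel:
    """Linear forward dynamics model: (phi(s), a) -> predicted phi(s')."""
    def __init__(self, feat_dim, act_dim, lr=0.01, seed=0):
        rng = np.random.default_rng(seed)
        self.W = rng.normal(scale=0.1, size=(feat_dim, feat_dim + act_dim))
        self.lr = lr

    def predict(self, feat, act):
        return self.W @ np.concatenate([feat, act])

    def update(self, feat, act, next_feat):
        # One gradient step on 0.5 * ||prediction error||^2.
        x = np.concatenate([feat, act])
        err = self.predict(feat, act) - next_feat
        self.W -= self.lr * np.outer(err, x)
        return err

def intrinsic_reward(model, feat, act, next_feat, eta=0.5):
    """Curiosity bonus: scaled squared prediction error of the forward model."""
    err = model.update(feat, act, next_feat)
    return eta * 0.5 * float(err @ err)

# Usage: at each environment step, add the bonus to the sparse extrinsic reward.
model = ForwardModel(feat_dim=4, act_dim=2)
s, a, s_next = np.ones(4), np.array([0.1, -0.2]), np.ones(4) * 1.1
r_ext = 0.0                          # sparse: zero until the goal is reached
r_total = r_ext + intrinsic_reward(model, s, a, s_next)
```

Because the forward model improves on transitions it has already seen, the bonus decays for familiar regions of the state space, steering exploration toward unknown areas, which is the behavior the abstract attributes to the intrinsic reward mechanism.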