Abstract

Path planning is one of the key research topics in robotics. Nowadays, researchers pay more attention to reinforcement learning (RL) and deep learning (DL) because of RL's good generality, self-learning ability, and DL's super leaning ability. Deep deterministic policy gradient (DDPG) algorithm, which combines the architectures of deep Q-learning (DQN), deterministic policy gradient (DPG) and Actor-Critic (AC), is different from the traditional RL methods and is suitable for continuous action space. Therefore, TPR-DDPG based path planning algorithm for mobile robots is proposed. In the algorithm, the state is preprocessed by various normalization methods, and complete reward-functions are designed to make agents reach the target point quickly by optimal paths in complex environments. The BatchNorm layer is added to the policy network, which ensures the stability of the algorithm. Finally, experimental results of agents' reaching the target points successfully through the paths generated by the improved DDPG validate the effectiveness of the proposed algorithm.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call