Abstract
A new deep deterministic policy gradient (DDPG) integrating kinematics analysis and immune optimization (KAI-DDPG) is proposed to address the drawbacks of DDPG in path planning. An orientation angle reward component, linear velocity reward factor, and safety performance reward factor are added to the DDPG reward function based on kinematic modeling and analysis of mobile robots. A multi-objective performance index turns the path planning problem into one of multi-objective optimization. We propose KA-DDPG, which uses the orientation angle, linear speed, and safety degree as evaluation indices, and information entropy to alter the influence coefficient of the multi-objective function in the reward function. KAI-DDPG is proposed to address the low learning and training efficiency of KA-DDPG, using immune optimization to optimize the experience samples in the experience buffer pool. Performance indices of traditional path planning and the proposed techniques are compared on a gazebo simulation platform, and the results suggest that KAI-DDPG can mitigate the drawbacks of DDPG, such as a protracted training cycle and poor path planning technique, and can broaden the range of application.
Published Version
Talk to us
Join us for a 30 min session where you can share your feedback and ask us any queries you have