Abstract
For collision-free navigation in unstructured and cluttered environments, deep reinforcement learning (DRL) has achieved considerable success because it can adapt to new environments without much human effort. However, its lack of data efficiency and robustness remains a challenge. In this paper, we present a new laser-based navigation system for mobile robots, which combines a global planner with reinforcement learning-based local trajectory re-planning. The proposed method uses Proximal Policy Optimization (PPO) to learn an efficient and robust local planning policy with asynchronous data generation and training. Extensive experiments show that the proposed system achieves better performance than previous methods, including end-to-end DRL, and mitigates the data-efficiency and robustness issues noted above. Our analysis shows that the proposed method efficiently avoids deadlock points and achieves a higher success rate. Moreover, we show that our system can generalize to unseen environments and obstacles with only a few shots. The model can also enable automated warehouse management through intelligent sorting and handling, and it is suitable for various customized application scenarios.
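The abstract describes a hybrid architecture in which a classical global planner supplies coarse waypoints and a learned policy re-plans the local trajectory from raw laser data. A minimal sketch of how such a pipeline could be wired together is shown below; the function names, the straight-line global planner, and the hand-coded stand-in for the learned PPO policy are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def global_plan(start, goal, n_waypoints=5):
    # Hypothetical global planner: straight-line interpolation between start
    # and goal. A real system would search an occupancy grid (e.g. with A*).
    return [start + (goal - start) * t
            for t in np.linspace(0.0, 1.0, n_waypoints + 1)[1:]]

def ppo_policy(laser_scan, to_waypoint):
    # Stand-in for the trained PPO local policy: maps the raw laser scan and
    # the vector to the next waypoint onto a velocity command (v, w).
    # Here: head toward the waypoint and slow down when obstacles are close.
    clearance = float(laser_scan.min())
    v = min(0.5, clearance)                              # linear velocity [m/s]
    w = float(np.clip(np.arctan2(to_waypoint[1], to_waypoint[0]), -1.0, 1.0))  # angular [rad/s]
    return v, w

def navigate(pose, goal, get_scan, step, tol=0.2):
    # Hybrid loop: the global planner supplies coarse waypoints, while the
    # learned local policy re-plans the trajectory reactively at every step.
    for wp in global_plan(pose.copy(), goal):
        while np.linalg.norm(wp - pose) > tol:
            scan = get_scan()              # 1-D array of laser range readings
            v, w = ppo_policy(scan, wp - pose)
            pose = step(pose, v, w)        # apply the command, return new pose
    return pose
```

In this split, the global planner handles long-horizon routing, while the learned policy handles reactive, collision-free local re-planning from raw sensor data; the latter is the component trained with PPO in the proposed system.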
Highlights
A basic skill for a mobile robot to serve human society is to navigate to a required location without collision
Our investigation shows that end-to-end deep reinforcement learning (DRL) is prone to being trapped in so-called deadlock points, where the cost function lies in a local minimum and the robot can hardly explore a way out
We show that with the global planner, the robot is less likely to be trapped in deadlock points, which yields a higher success rate of reaching the desired goal
Summary
A basic skill for a mobile robot to serve human society is to navigate to a required location without collision. Navigation based merely on simple sensors such as lasers can be quite challenging, especially in cluttered environments. Traditional methods for this task combine planning methods with hand-designed rules and models [1,2,3]. These methods require meticulous engineering and are typically designed for specific robots and structured obstacles. End-to-end DRL navigation has emerged as a learning-based alternative, but several challenges remain: First, a mapless end-to-end DRL navigator requires a huge amount of data for training, and generalization to new scenarios can be costly. Widely used off-policy DRL algorithms such as deep deterministic policy gradients (DDPG) [8]
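Such off-policy methods contrast with the on-policy PPO algorithm adopted in this work. For reference, PPO maximizes the standard clipped surrogate objective (Schulman et al., 2017); the notation below is the generic formulation rather than the paper's own:

L^{CLIP}(\theta) = \hat{\mathbb{E}}_t\left[\min\left(r_t(\theta)\,\hat{A}_t,\ \operatorname{clip}\big(r_t(\theta),\,1-\epsilon,\,1+\epsilon\big)\,\hat{A}_t\right)\right], \qquad r_t(\theta) = \frac{\pi_\theta(a_t \mid s_t)}{\pi_{\theta_{\text{old}}}(a_t \mid s_t)}

where \hat{A}_t is the advantage estimate and \epsilon is the clipping parameter; the clipping constrains each update to stay close to the data-collecting policy, which helps keep training stable when rollouts are generated asynchronously, as described in the abstract.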