Abstract
In this paper, we propose a novel Deep Reinforcement Learning (DRL) algorithm that navigates non-holonomic robots with continuous control in unknown dynamic environments with moving obstacles. We call the approach MK-A3C (Memory and Knowledge-based Asynchronous Advantage Actor-Critic). As its first component, MK-A3C builds a GRU-based memory neural network to enhance the robot's capability for temporal reasoning; robots without such memory tend to behave irrationally in the face of incomplete and noisy state estimates in complex environments, whereas robots endowed with memory by MK-A3C can avoid local-minimum traps by implicitly estimating the environmental model. Secondly, MK-A3C combines a domain-knowledge-based reward function with a transfer-learning-based training task architecture, which mitigates the policy non-convergence problem caused by sparse rewards. Together, these improvements allow MK-A3C to efficiently navigate robots in unknown dynamic environments while satisfying kinematic constraints and handling moving obstacles. Simulation experiments show that, compared with existing methods, MK-A3C achieves successful robotic navigation in unknown and challenging environments by outputting continuous acceleration commands.
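To make the memory component concrete, the following is a minimal NumPy sketch of a GRU cell feeding an actor head (bounded continuous acceleration command) and a critic head (state value). The class name `GRUActorCritic`, all dimensions, and the initialization scheme are illustrative assumptions, not the paper's actual architecture.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class GRUActorCritic:
    """Illustrative GRU memory cell with actor/critic heads.

    obs_dim:    size of the sensor observation vector
    hidden_dim: size of the recurrent memory state
    act_dim:    continuous action dimension (e.g. linear/angular acceleration)
    """

    def __init__(self, obs_dim, hidden_dim, act_dim, seed=0):
        rng = np.random.default_rng(seed)
        def W(rows, cols):  # small random initialization
            return 0.1 * rng.standard_normal((rows, cols))
        # GRU gate parameters: update (z), reset (r), candidate state (h~)
        self.Wz, self.Uz, self.bz = W(hidden_dim, obs_dim), W(hidden_dim, hidden_dim), np.zeros(hidden_dim)
        self.Wr, self.Ur, self.br = W(hidden_dim, obs_dim), W(hidden_dim, hidden_dim), np.zeros(hidden_dim)
        self.Wh, self.Uh, self.bh = W(hidden_dim, obs_dim), W(hidden_dim, hidden_dim), np.zeros(hidden_dim)
        self.Wa = W(act_dim, hidden_dim)  # actor head
        self.Wv = W(1, hidden_dim)        # critic head

    def step(self, obs, h):
        """One recurrent step: consume an observation, update the memory h."""
        z = sigmoid(self.Wz @ obs + self.Uz @ h + self.bz)
        r = sigmoid(self.Wr @ obs + self.Ur @ h + self.br)
        h_cand = np.tanh(self.Wh @ obs + self.Uh @ (r * h) + self.bh)
        h_new = (1.0 - z) * h + z * h_cand
        action = np.tanh(self.Wa @ h_new)   # acceleration bounded in (-1, 1)
        value = (self.Wv @ h_new).item()    # scalar state-value estimate
        return action, value, h_new

# Usage: feed a short sequence of noisy range readings; the hidden state h
# carries a summary of the observation history across steps.
net = GRUActorCritic(obs_dim=8, hidden_dim=16, act_dim=2)
h = np.zeros(16)
for t in range(5):
    obs = np.random.default_rng(t).uniform(0.0, 1.0, size=8)  # mock sensor scan
    action, value, h = net.step(obs, h)
```

Because the hidden state persists across steps, two identical sensor readings taken in different corridors can still map to different actions, which is the capability the memory network is meant to provide.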
Highlights
Autonomous navigation plays an important role in the fields of video games [1] and robotics [2], generating reasonable trajectories for Non-Player Characters (NPCs) and meeting the fundamental mobility needs of real-life robots.
Robots endowed with a degree of memory by MK-A3C can avoid local-minimum traps by estimating the environmental model.
The navigation problem targeted in this paper can be described as follows: in an initially unknown environment where terrains such as dead corners and long corridors are widespread, a mobile robot equipped only with sparse local distance sensors must move safely to a given destination under continuous control commands.
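The abstract attributes the policy non-convergence problem to sparse rewards and addresses it with a domain-knowledge-based reward function. A hypothetical shaping of that kind, which densifies the goal signal with a distance-progress term, might look like the sketch below; the function name, coefficients, and terms are my own illustration, not the paper's actual reward.

```python
def shaped_reward(prev_dist, curr_dist, collided, reached,
                  k_progress=1.0, r_goal=10.0, r_collision=-10.0, step_cost=-0.01):
    """Hypothetical dense navigation reward (illustrative only).

    prev_dist / curr_dist: distance to the goal before / after the action
    collided / reached:    episode-termination flags
    """
    if reached:
        return r_goal
    if collided:
        return r_collision
    # The progress term rewards every step that shrinks the distance to the
    # goal, so the agent gets feedback long before reaching the sparse
    # terminal reward; the small step cost discourages wandering.
    return k_progress * (prev_dist - curr_dist) + step_cost

# Moving 0.3 m closer to the goal yields a small positive reward.
r = shaped_reward(prev_dist=5.0, curr_dist=4.7, collided=False, reached=False)
```

With only the terminal terms, a random policy almost never sees a nonzero reward in a long corridor; the progress term is one standard way domain knowledge is injected to make learning converge.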
Summary
Autonomous navigation plays an important role in the fields of video games [1] and robotics [2], generating reasonable trajectories for Non-Player Characters (NPCs) and meeting the fundamental mobility needs of real-life robots. This paper focuses on the navigation problem of non-holonomic robots with continuous motion control in unknown dynamic environments. Algorithms from the first category, classical path planning, conduct efficient path search in a heuristic or non-heuristic manner given precise environmental states. They include Simple Subgoal Graph (SSG) [4] and its variants [5], and sampling-based path planning algorithms such as Probabilistic RoadMaps (PRM) [6] and Rapidly-exploring Random Trees (RRT) [7]. The sampling-based algorithms commonly used in robotics are probabilistically complete and efficient. These methods, however, require a model of the entire map, often represented by occupancy grids or topological maps.
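To illustrate why planners such as RRT need the full map up front, here is a bare-bones RRT sketch: the collision checker `is_free` must answer queries anywhere in the workspace before the robot moves. The 10x10 workspace bounds, the 10% goal bias, and the step size are arbitrary assumptions for this sketch, not values from any cited algorithm.

```python
import math
import random

def rrt(start, goal, is_free, step=0.5, goal_tol=0.5, max_iters=5000, seed=1):
    """Minimal RRT in a 10x10 2D workspace (illustrative sketch only)."""
    rng = random.Random(seed)
    nodes = [start]
    parent = {0: None}
    for _ in range(max_iters):
        # Goal-biased sampling: aim directly at the goal 10% of the time.
        sample = goal if rng.random() < 0.1 else (rng.uniform(0, 10), rng.uniform(0, 10))
        # Nearest tree node (linear scan; a k-d tree would scale better).
        i = min(range(len(nodes)), key=lambda k: math.dist(nodes[k], sample))
        d = math.dist(nodes[i], sample)
        if d == 0.0:
            continue
        # Steer one fixed step from the nearest node toward the sample.
        nx, ny = nodes[i]
        new = (nx + step * (sample[0] - nx) / d, ny + step * (sample[1] - ny) / d)
        if not is_free(*new):
            continue  # discard extensions that collide with the map
        parent[len(nodes)] = i
        nodes.append(new)
        if math.dist(new, goal) <= goal_tol:
            # Walk parent pointers back to the root to recover the path.
            path, k = [], len(nodes) - 1
            while k is not None:
                path.append(nodes[k])
                k = parent[k]
            return list(reversed(path))
    return None  # no path found within the iteration budget

# Usage: a 10x10 world split by a vertical wall with a gap above y = 7.
free = lambda x, y: not (4.0 <= x <= 5.0 and y <= 7.0)
path = rrt((1.0, 1.0), (9.0, 1.0), free)
```

Note that `is_free` encodes global knowledge of the wall; in the unknown environments this paper targets, no such oracle exists, which is precisely the gap the learning-based approach is meant to fill.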