Abstract

A study is presented on applying deep reinforcement learning (DRL) for visual navigation of wheeled mobile robots (WMR) in dynamic and unknown environments. Two DRL algorithms have been considered: the value-based deep Q-network (DQN) and the policy-gradient-based asynchronous advantage actor-critic (A3C). RGB (red, green, and blue) and depth images have been used as inputs in the implementation of both DRL algorithms to generate control commands for autonomous navigation of a WMR in simulation environments. The initial DRL networks were generated and trained progressively in OpenAI Gym-Gazebo-based simulation environments within the Robot Operating System (ROS) framework for a popular target WMR, the Kobuki TurtleBot2. A pre-trained deep neural network, ResNet50, was further trained on regrouped objects commonly found in a laboratory setting and used for target-driven mapless visual navigation of the TurtleBot2 through DRL. The performance of A3C with multiple computation threads (4, 6, and 8) was simulated on a desktop computer. The navigation performance of the DQN and A3C networks, in terms of reward statistics and completion time, was compared in three simulation environments. As expected, A3C with multiple threads (4, 6, and 8) performed better than DQN, and the performance of A3C improved with the number of threads. Details of the methodology and simulation results are presented, and recommendations for future work toward real-time implementation through transfer learning of the DRL models are outlined.
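The abstract does not include implementation details, so the following is only a minimal illustrative sketch, not the authors' code, of the value-based branch: a DQN-style convolutional network (standard Atari-DQN layer sizes, assumed here) that maps a 4-channel RGB-D frame to Q-values over a hypothetical discrete command set for a TurtleBot2-like robot. The action names, input resolution, and epsilon value are assumptions for illustration.

```python
# Illustrative sketch (assumed PyTorch; not from the paper): a DQN that maps
# a preprocessed 4-channel RGB-D image to Q-values over discrete commands.
import random
import torch
import torch.nn as nn

ACTIONS = ["forward", "turn_left", "turn_right"]  # hypothetical command set

class DQN(nn.Module):
    def __init__(self, n_actions=len(ACTIONS)):
        super().__init__()
        # RGB (3 channels) + depth (1 channel) = 4 input channels.
        self.features = nn.Sequential(
            nn.Conv2d(4, 32, kernel_size=8, stride=4), nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=4, stride=2), nn.ReLU(),
            nn.Conv2d(64, 64, kernel_size=3, stride=1), nn.ReLU(),
        )
        self.head = nn.Sequential(
            nn.Flatten(),
            nn.Linear(64 * 7 * 7, 512), nn.ReLU(),  # assumes 84x84 input frames
            nn.Linear(512, n_actions),
        )

    def forward(self, x):
        return self.head(self.features(x))

def select_action(net, state, epsilon=0.1):
    """Epsilon-greedy selection over the discrete command set."""
    if random.random() < epsilon:
        return random.randrange(len(ACTIONS))
    with torch.no_grad():
        return net(state.unsqueeze(0)).argmax(dim=1).item()

if __name__ == "__main__":
    net = DQN()
    rgbd = torch.rand(4, 84, 84)  # stand-in for one preprocessed RGB-D frame
    print(ACTIONS[select_action(net, rgbd)])
```

In an actual Gym-Gazebo setup, the selected action index would be translated into a velocity command published to the robot, and transitions would be stored in a replay buffer for Q-learning updates; those pieces are omitted here for brevity.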
