Abstract

In this article, we propose a deep reinforcement learning (DRL) algorithm together with a tailored neural network architecture that enables mobile robots to learn navigation policies autonomously. We first introduce a new feature extractor that better captures critical spatiotemporal features from raw depth images. In addition, we present a double-source scheme in which experiences are collected alternately from the proposed model and a conventional planner, based on a switching criterion, to provide more diverse and comprehensive samples for learning. Moreover, we propose a dual soft actor-critic architecture that trains two sets of networks with different purposes simultaneously: the primary network learns the autonomous navigation policy, while the secondary network learns the depth feature extractor. Decoupling representation learning from policy learning in this way, and training the feature extractor separately with more specific goals, improves learning performance. Experimental results demonstrate the strong performance of the proposed model: it consistently outperforms both conventional and state-of-the-art DRL-based methods in success rate, and it also exhibits higher trajectory quality and better generalization than existing DRL-based methods. Videos of our experiments are available at https://youtu.be/evjU6bOU3UY or OneDrive.
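The double-source scheme described above can be sketched as follows. This is a minimal, hypothetical illustration: the class name, the parameters, and the use of the policy's recent success rate as the switching criterion are all assumptions for exposition, not details taken from the paper.

```python
class DoubleSourceCollector:
    """Alternate experience sources between the learned policy and a
    conventional planner, based on a switching criterion.

    Hypothetical sketch: here the criterion is the learned policy's recent
    success rate; the paper's actual criterion may differ.
    """

    def __init__(self, success_threshold=0.5, window=20):
        self.success_threshold = success_threshold
        self.window = window
        self.recent_outcomes = []  # 1 = success, 0 = failure

    def record_outcome(self, success):
        # Keep a sliding window of the learned policy's recent episode results.
        self.recent_outcomes.append(1 if success else 0)
        self.recent_outcomes = self.recent_outcomes[-self.window:]

    def use_planner(self):
        # Fall back to the conventional planner while the learned policy's
        # recent success rate is below the threshold (or no data yet).
        if not self.recent_outcomes:
            return True
        rate = sum(self.recent_outcomes) / len(self.recent_outcomes)
        return rate < self.success_threshold
```

In use, each episode would query `use_planner()` to choose the acting source, then report the episode outcome via `record_outcome()`; both sources' transitions feed the same replay buffer.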
