Abstract

Autonomous mobile robots have become popular in applications where they coexist with humans, which requires them to navigate efficiently and safely in crowd environments with diverse pedestrians. Pedestrians may cooperate with the robot by actively avoiding it, or may simply ignore it while walking, whereas some pedestrians, denoted as non-cooperators, may try to block the robot. It is also challenging to identify potential vulnerabilities of a navigation policy, i.e., situations in which the robot may cause a collision, across various crowd environments, which reduces the reliability and safety of the robot. In this paper, we propose a deep reinforcement learning (DRL) approach to train a policy that simulates the behavior of non-cooperators and can effectively identify vulnerabilities of a navigation policy. We evaluate the approach both on the Robot Operating System (ROS) navigation stack with the dynamic window approach (DWA) and on a DRL-based navigation policy, identifying useful vulnerabilities of both navigation policies for further improvement. Moreover, these non-cooperators play a game against the DRL-based navigation policy, so we can improve the robustness of the navigation policy by retraining it in the sense of asymmetric self-play. We evaluate the retrained navigation policy in various crowd environments with diverse pedestrians. The experimental results show that the approach improves the robustness of the navigation policy. The source code for the training and the simulation platform is released online at https://github.com/DRL-Navigation.
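To make the asymmetric self-play idea concrete, the sketch below shows the alternating training scheme the abstract describes: first train the non-cooperator policy against a frozen navigation policy to provoke collisions (vulnerability identification), then retrain the navigation policy against the frozen non-cooperators (robustness improvement). This is a minimal illustration, not the authors' implementation; all class names, the toy 1-D environment, and the reward comments are hypothetical placeholders.

```python
# Hedged sketch of asymmetric self-play between a navigation policy and a
# non-cooperator (adversarial pedestrian) policy. Names and environment
# are illustrative placeholders, not the paper's actual code.
import random

class Policy:
    """Stand-in for a DRL policy (in practice, a neural network)."""
    def act(self, obs):
        return random.uniform(-1.0, 1.0)  # placeholder for a learned action
    def update(self, transitions):
        pass  # placeholder for a gradient step (e.g., PPO)

def rollout(nav_policy, ped_policy, steps=50):
    """Toy 1-D episode: the robot moves toward a goal while a
    non-cooperating pedestrian tries to block it."""
    robot, ped, goal = 0.0, 2.0, 5.0
    transitions, collided = [], False
    for _ in range(steps):
        robot += 0.1 * nav_policy.act((robot, ped, goal))
        ped += 0.1 * ped_policy.act((robot, ped))
        collided = abs(robot - ped) < 0.1
        transitions.append((robot, ped, collided))
        if collided or robot >= goal:
            break
    return transitions, collided

nav, ped = Policy(), Policy()
for phase in range(10):
    # Phase A: freeze the navigation policy; train non-cooperators to
    # cause collisions (vulnerability identification).
    for _ in range(20):
        transitions, collided = rollout(nav, ped)
        ped.update(transitions)   # pedestrian reward ~ +1 on collision
    # Phase B: freeze the non-cooperators; retrain navigation to stay
    # safe against them (robustness improvement).
    for _ in range(20):
        transitions, collided = rollout(nav, ped)
        nav.update(transitions)   # robot reward ~ goal progress, -1 on collision
```

The asymmetry lies in the two objectives: the non-cooperators are rewarded only for inducing failures, while the navigation policy must balance safety against reaching its goal.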
