Abstract

Collision avoidance algorithms are essential for safe and efficient robot operation among pedestrians. This work proposes using deep reinforcement learning (RL) as a framework to model the complex interactions and cooperation with nearby, decision-making agents, such as pedestrians and other robots. Existing RL-based works assume homogeneity of agent properties, use specific motion models over short timescales, or lack a principled method to handle a large, possibly varying number of agents. Therefore, this work develops an algorithm that learns collision avoidance among a variety of heterogeneous, non-communicating, dynamic agents without assuming they follow any particular behavior rules. It extends our previous work by introducing a strategy using Long Short-Term Memory (LSTM) that enables the algorithm to use observations of an arbitrary number of other agents, instead of a small, fixed number of neighbors. The proposed algorithm is shown to outperform a classical collision avoidance algorithm and another deep RL-based algorithm, and to scale better with the number of agents (fewer collisions, shorter time to goal) than our previously published learning-based approach. Analysis of the LSTM provides insights into how observations of nearby agents affect the hidden state and quantifies the performance impact of various agent ordering heuristics. The learned policy generalizes to several applications beyond the training scenarios: formation control (arrangement into letters), and demonstrations on a fleet of four multirotors and on a fully autonomous robotic vehicle capable of traveling at human walking speed among pedestrians.

Highlights

  • A fundamental challenge in autonomous vehicle operation is to safely negotiate interactions with other dynamic agents in the environment.

  • This work is based on [15]–[17] and extends them as follows: (i) expanded discussion and example of the limitations of the prior work, (ii) further explanation of the proposed algorithm, including pseudo-code, (iii) analysis of the effect of sequence ordering in Long Short-Term Memory (LSTM), which addresses a primary gap in the prior work, (iv) quantification of input gate activation to provide deeper intuition on why the proposed use of LSTM works, (v) additional comparisons to model- and learning-based collision avoidance algorithms, (vi) an ablation study of the proposed algorithm, and (vii) experiments with formation control and on real multirotors to demonstrate generalizability of the learned policy.

  • In addition to not capturing the decision-making behavior of other agents, our experiments suggest that t is a crucial parameter to ensure convergence while training the deep neural network (DNN) in the previous algorithms.

Summary

INTRODUCTION

A fundamental challenge in autonomous vehicle operation is to safely negotiate interactions with other dynamic agents in the environment. Rather than assuming a small, fixed number of neighbors, this work uses an idea from Natural Language Processing [12], [13] to encode the variable-size state of the world (e.g., positions of other agents) into a fixed-length vector, using Long Short-Term Memory (LSTM) [14] cells at the network input. This enables the algorithm to make decisions based on an arbitrary number of other agents in the robot’s vicinity. This work is based on [15]–[17] and extends them as follows: (i) expanded discussion and example of the limitations of the prior work, (ii) further explanation of the proposed algorithm, including pseudo-code, (iii) analysis of the effect of sequence ordering in LSTM, which addresses a primary gap in the prior work, (iv) quantification of input gate activation to provide deeper intuition on why the proposed use of LSTM works, (v) additional comparisons to model- and learning-based collision avoidance algorithms, (vi) an ablation study of the proposed algorithm, and (vii) experiments with formation control and on real multirotors to demonstrate generalizability of the learned policy.
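
The encoding idea described above can be illustrated with a minimal sketch in plain Python (standard library only). It feeds one observation per nearby agent through a single LSTM cell and keeps the final hidden state, so the output length does not depend on how many agents are present. Everything specific here is an assumption made for illustration rather than the authors' implementation: the names TinyLSTMCell and encode_agents, the five-number agent state (relative position, velocity, radius), the hidden size, the random untrained weights, and the farthest-first ordering, which is just one possible ordering heuristic.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class TinyLSTMCell:
    """Hypothetical single LSTM cell with random, untrained weights."""

    def __init__(self, input_size, hidden_size, seed=0):
        rng = random.Random(seed)
        self.hidden_size = hidden_size
        n = input_size + hidden_size
        # Rows are stacked by gate: input, forget, output, candidate.
        self.W = [[rng.uniform(-0.5, 0.5) for _ in range(n)]
                  for _ in range(4 * hidden_size)]
        self.b = [0.0] * (4 * hidden_size)

    def step(self, x, h, c):
        """Consume one agent observation x; return the new (h, c)."""
        z = list(x) + list(h)
        pre = [sum(w * v for w, v in zip(row, z)) + b
               for row, b in zip(self.W, self.b)]
        H = self.hidden_size
        i = [sigmoid(p) for p in pre[0:H]]            # input gate
        f = [sigmoid(p) for p in pre[H:2 * H]]        # forget gate
        o = [sigmoid(p) for p in pre[2 * H:3 * H]]    # output gate
        g = [math.tanh(p) for p in pre[3 * H:4 * H]]  # candidate values
        c_new = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c, i, g)]
        h_new = [ok * math.tanh(ck) for ok, ck in zip(o, c_new)]
        return h_new, c_new


def encode_agents(cell, agent_states):
    """Encode any number of agent states into one fixed-length vector.

    Each state is assumed to be (px, py, vx, vy, radius) relative to the
    robot. Agents are fed farthest first, so the closest agent is seen
    last (an assumed ordering heuristic).
    """
    ordered = sorted(agent_states,
                     key=lambda s: math.hypot(s[0], s[1]),
                     reverse=True)
    h = [0.0] * cell.hidden_size
    c = [0.0] * cell.hidden_size
    for state in ordered:
        h, c = cell.step(state, h, c)
    return h  # always has length cell.hidden_size


if __name__ == "__main__":
    cell = TinyLSTMCell(input_size=5, hidden_size=8)
    two_agents = [(1.0, 0.0, 0.1, 0.0, 0.3), (3.0, 4.0, -0.2, 0.1, 0.4)]
    four_agents = two_agents + [(0.5, -2.0, 0.0, 0.3, 0.2),
                                (-6.0, 1.0, 0.4, 0.0, 0.5)]
    # Both encodings have the same length despite different agent counts.
    print(len(encode_agents(cell, two_agents)),
          len(encode_agents(cell, four_agents)))
```

In a setup like the one the text describes, the fixed-length vector would presumably be passed on to the rest of the network in place of a fixed-size list of neighbors. Because the sketch sorts agents by distance before feeding them in, the result does not depend on the order in which the observations arrive, as long as the distances are distinct.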

PROBLEM FORMULATION
RELATED WORKS USING LEARNING
HANDLING A VARIABLE NUMBER OF AGENTS
RESULTS
CONCLUSION