Collision Avoidance in Pedestrian-Rich Environments With Deep Reinforcement Learning

Michael Everett, Yu Fan Chen, Jonathan P How

doi:10.1109/access.2021.3050338

Open Access

https://doi.org/10.1109/access.2021.3050338

Abstract

Collision avoidance algorithms are essential for safe and efficient robot operation among pedestrians. This work proposes using deep reinforcement learning (RL) as a framework to model the complex interactions and cooperation with nearby, decision-making agents, such as pedestrians and other robots. Existing RL-based works assume homogeneity of agent properties, use specific motion models over short timescales, or lack a principled method to handle a large, possibly varying number of agents. Therefore, this work develops an algorithm that learns collision avoidance among a variety of heterogeneous, non-communicating, dynamic agents without assuming they follow any particular behavior rules. It extends our previous work by introducing a strategy using Long Short-Term Memory (LSTM) that enables the algorithm to use observations of an arbitrary number of other agents, instead of a small, fixed number of neighbors. The proposed algorithm is shown to outperform a classical collision avoidance algorithm and another deep RL-based algorithm, and to scale with the number of agents better (fewer collisions, shorter time to goal) than our previously published learning-based approach. Analysis of the LSTM provides insights into how observations of nearby agents affect the hidden state and quantifies the performance impact of various agent ordering heuristics. The learned policy generalizes to several applications beyond the training scenarios: formation control (arrangement into letters), demonstrations on a fleet of four multirotors, and demonstrations on a fully autonomous robotic vehicle capable of traveling at human walking speed among pedestrians.

Highlights

A fundamental challenge in autonomous vehicle operation is to safely negotiate interactions with other dynamic agents in the environment
This work is based on [15]–[17] and extends them as follows: (i) expanded discussion and example of the limitations of the prior work, (ii) further explanation of the proposed algorithm, including pseudo-code, (iii) analysis on the effect of sequence ordering in Long Short-Term Memory (LSTM), which addresses a primary gap in the prior work, (iv) quantifying input gate activation to provide deeper intuition on why the proposed use of LSTM works, (v) additional comparisons to model- and learning-based collision avoidance algorithms, (vi) ablation study of the proposed algorithm, and (vii) experiments with formation control and on real multirotors to demonstrate generalizability of the learned policy
In addition to not capturing decision making behavior of other agents, our experiments suggest that Δt is a crucial parameter to ensure convergence while training the deep neural network (DNN) in the previous algorithms

Summary

INTRODUCTION

A fundamental challenge in autonomous vehicle operation is to safely negotiate interactions with other dynamic agents in the environment. This work uses an idea from Natural Language Processing [12], [13] to encode the varying-size state of the world (e.g., positions of other agents) into a fixed-length vector, using long short-term memory (LSTM) [14] cells at the network input. This enables the algorithm to make decisions based on an arbitrary number of other agents in the robot’s vicinity. This work is based on [15]–[17] and extends them as follows: (i) expanded discussion and example of the limitations of the prior work, (ii) further explanation of the proposed algorithm, including pseudo-code, (iii) analysis on the effect of sequence ordering in LSTM, which addresses a primary gap in the prior work, (iv) quantifying input gate activation to provide deeper intuition on why the proposed use of LSTM works, (v) additional comparisons to model- and learning-based collision avoidance algorithms, (vi) ablation study of the proposed algorithm, and (vii) experiments with formation control and on real multirotors to demonstrate generalizability of the learned policy.
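The fixed-length encoding described above can be illustrated with a minimal sketch. It is not the authors' implementation: the class `TinyLSTMEncoder`, the helper `order_farthest_first`, the four-element observation layout (relative x, y, vx, vy), the hidden size, and the random untrained weights are all assumptions made for illustration. The farthest-first ordering is one plausible example of an agent ordering heuristic, chosen here so that the closest agent is processed last.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


class TinyLSTMEncoder:
    """Hypothetical single-layer LSTM with random (untrained) weights."""

    def __init__(self, obs_dim, hidden_dim, seed=0):
        rng = np.random.default_rng(seed)
        self.hidden_dim = hidden_dim
        # One stacked matrix holding the input, forget, cell, and output gate weights.
        self.W = rng.normal(scale=0.1, size=(4 * hidden_dim, obs_dim + hidden_dim))
        self.b = np.zeros(4 * hidden_dim)

    def encode(self, agent_obs):
        """Fold a (num_agents, obs_dim) array into one vector of length hidden_dim."""
        h = np.zeros(self.hidden_dim)
        c = np.zeros(self.hidden_dim)
        for x in agent_obs:  # one LSTM step per observed agent
            z = self.W @ np.concatenate([x, h]) + self.b
            i, f, g, o = np.split(z, 4)
            c = sigmoid(f) * c + sigmoid(i) * np.tanh(g)
            h = sigmoid(o) * np.tanh(c)
        return h  # the final hidden state is the fixed-length summary


def order_farthest_first(agent_obs):
    """Assumed heuristic: sort by decreasing distance so the closest agent is fed last."""
    dist = np.linalg.norm(agent_obs[:, :2], axis=1)
    return agent_obs[np.argsort(-dist)]


encoder = TinyLSTMEncoder(obs_dim=4, hidden_dim=8)
rng = np.random.default_rng(1)
for num_agents in (1, 3, 10):
    obs = rng.normal(size=(num_agents, 4))  # assumed columns: rel. x, y, vx, vy
    summary = encoder.encode(order_farthest_first(obs))
    print(num_agents, summary.shape)
```

Whatever the number of observed agents, the summary has the same length, so it can be concatenated with the robot's own state and passed to fixed-size downstream layers. Because each step overwrites part of the hidden state, the order in which agents are fed matters, which is why an ordering heuristic is part of the design.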

PROBLEM FORMULATION

RELATED WORKS USING LEARNING

HANDLING A VARIABLE NUMBER OF AGENTS

RESULTS

CONCLUSION

Journal: IEEE Access
Publication Date: Oct 24, 2019
Citations: 144
License type: CC BY 4.0

Similar Papers

Motion Planning Among Dynamic, Decision-Making Agents with Deep Reinforcement Learning
Michael Everett ... Jonathan P How
01 Oct 2018

Formation Control with Collision Avoidance through Deep Reinforcement Learning
Zezhi Sui ... Tianyi Xiong
01 Jul 2019

Formation Control With Collision Avoidance Through Deep Reinforcement Learning Using Model-Guided Demonstration
Zezhi Sui ... Shiguang Wu
IEEE Transactions on Neural Networks and Learning Systems | VOL. 32
16 Jul 2020

Automatic ship collision avoidance using deep reinforcement learning with LSTM in continuous action spaces
Ryohei Sawada ... Keiji Sato
Journal of Marine Science and Technology | VOL. 26
03 Aug 2020
