Toward Observation Based Least Restrictive Collision Avoidance Using Deep Meta Reinforcement Learning

Salar Asayesh, Mo Chen, Kamal Gupta, Mehran Mehrandezh

doi:10.1109/lra.2021.3098332

Abstract

This letter presents the Observation-based Least-Restrictive Collision Avoidance Module (OLR-CAM), which can be added to any autonomous robot working in a shared environment to provide a high-level safety layer on top of each robot's existing policy. The OLR-CAM takes raw sensory observations as input, evaluates the agent's safety against dynamic and static obstacles, and intervenes in the default policy only when needed, in a least-restrictive fashion, to avoid a potential collision. In our approach, we meta-train the OLR-CAM policy within a “2D Navigation Meta World System”. Furthermore, to endow the policy with a notion of safety in multi-agent environments with obstacles, we propose a novel reward function based on a local cost map and a safety value function derived from Hamilton-Jacobi reachability theory. The proposed reward function does not need any additional information about the environment's map, which facilitates adoption of the algorithm in a new environment at the meta-test stage. The proposed algorithm is fully meta-trained in simulation and tested on a real multi-agent system without any additional training in the real setting. Our results show that the OLR-CAM's success rate exceeds that of a well-known classical baseline approach by 10 percent on average, and that it reduces the interruptions/changes to the preferred velocity by 15 percent.
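To illustrate what "least-restrictive" means in the abstract above, the following is a minimal sketch of a generic least-restrictive safety filter. It is not the OLR-CAM policy itself, which is learned from raw observations. The function name `least_restrictive_filter`, the scalar `safety_value`, the `threshold`, and the `avoidance_velocity` input are all hypothetical names introduced here for illustration. The sketch assumes that a larger safety value means a safer state.

```python
from typing import Tuple

Velocity = Tuple[float, float]  # hypothetical (vx, vy) command


def least_restrictive_filter(
    preferred_velocity: Velocity,
    avoidance_velocity: Velocity,
    safety_value: float,
    threshold: float = 0.0,
) -> Tuple[Velocity, bool]:
    """Pass the default policy's command through unless the state is unsafe.

    Returns the command to execute and a flag telling whether the
    safety layer intervened. All inputs are illustrative assumptions.
    """
    if safety_value > threshold:
        # State is considered safe: leave the default policy untouched.
        return preferred_velocity, False
    # State is at or below the safety threshold: override the command.
    return avoidance_velocity, True


if __name__ == "__main__":
    cmd, intervened = least_restrictive_filter((1.0, 0.0), (0.0, 0.5), safety_value=0.8)
    print(cmd, intervened)
    cmd, intervened = least_restrictive_filter((1.0, 0.0), (0.0, 0.5), safety_value=-0.2)
    print(cmd, intervened)
```

The intervention flag makes it easy to count how often the preferred velocity is overridden, which is the kind of quantity the abstract reports as interruptions/changes to the preferred velocity.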

Full Text