Abstract

The use of neural networks and reinforcement learning has become increasingly popular in autonomous vehicle control. However, the opaqueness of the resulting control policies presents a significant barrier to deploying neural network-based control in autonomous vehicles. In this paper, we present a reinforcement learning-based approach to autonomous vehicle longitudinal control, in which rule-based safety cages provide enhanced safety for the vehicle as well as weak supervision to the reinforcement learning agent. By guiding the agent towards meaningful states and actions, this weak supervision improves convergence during training and enhances the safety of the final trained policy. The rule-based supervisory controller has the further advantage of being fully interpretable, thereby enabling traditional validation and verification approaches to ensure the safety of the vehicle. We compare models with and without safety cages, as well as models with optimal and constrained model parameters, and show that the weak supervision consistently improves the safety of exploration, speed of convergence, and model performance. Additionally, we show that when the model parameters are constrained or sub-optimal, the safety cages can enable a model to learn a safe driving policy even when the model could not be trained to drive through reinforcement learning alone.
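
To make the supervisory architecture concrete, below is a minimal sketch of a rule-based safety cage wrapping an RL agent's longitudinal command. The state fields, headway and time-to-collision thresholds, and the action convention (pedal command in [-1, 1]) are illustrative assumptions, not the paper's exact implementation.

```python
# Hypothetical sketch of a rule-based safety cage for longitudinal control.
# Thresholds and state representation are assumptions for illustration.
from dataclasses import dataclass

@dataclass
class State:
    ego_speed: float      # ego vehicle speed, m/s
    gap: float            # distance to lead vehicle, m
    closing_speed: float  # ego_speed - lead_speed, m/s

def safety_cage(state: State, rl_action: float,
                min_headway_s: float = 1.0, min_ttc_s: float = 2.0):
    """Return (action, intervened): override the learned pedal command with
    hard braking when time headway or time-to-collision drops below a rule-based limit."""
    headway = state.gap / max(state.ego_speed, 0.1)
    ttc = state.gap / state.closing_speed if state.closing_speed > 0 else float("inf")
    if headway < min_headway_s or ttc < min_ttc_s:
        return -1.0, True      # safety cage intervenes with full braking
    return rl_action, False    # learned action passes through unchanged
```

During training, the `intervened` flag can also serve as the weak supervision signal, for example by adding a penalty to the reward whenever the cage overrides the agent; the exact reward shaping used in the paper is not reproduced here.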

Highlights

  • We demonstrate that, with weak supervision from the safety cages during training, a shallow model that otherwise could not learn to drive can be trained to drive without collisions

  • We demonstrate that the interventions by the safety cages can be used to re-train the neural network through supervised learning, enabling the system to learn from its own mistakes and making the controller more robust (a minimal sketch of this re-training step follows this list)
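
The sketch below illustrates this re-training step, assuming a PyTorch policy network and a buffer of (state, safe_action) pairs logged at each intervention; the network size, optimiser settings, and buffer format are illustrative assumptions rather than the paper's configuration.

```python
# Hypothetical supervised fine-tuning of the policy on safety cage interventions.
import torch
import torch.nn as nn

# Small actor network: 3-dimensional state in, pedal command in [-1, 1] out (assumed shapes).
actor = nn.Sequential(nn.Linear(3, 64), nn.ReLU(), nn.Linear(64, 1), nn.Tanh())
optimiser = torch.optim.Adam(actor.parameters(), lr=1e-4)
loss_fn = nn.MSELoss()

def retrain_on_interventions(intervention_buffer, epochs: int = 5):
    """Regress the policy towards the corrective actions issued by the safety cage."""
    states = torch.tensor([s for s, _ in intervention_buffer], dtype=torch.float32)
    safe_actions = torch.tensor([[a] for _, a in intervention_buffer], dtype=torch.float32)
    for _ in range(epochs):
        optimiser.zero_grad()
        loss = loss_fn(actor(states), safe_actions)
        loss.backward()
        optimiser.step()
```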

Summary

Introduction

Autonomous driving has gained significant attention within the automotive research community in recent years [1,2,3]. While imitation learning-based approaches have shown important progress in autonomous driving [27,28,29,30], they present limitations when deployed in environments beyond the training distribution [31]. Driving models relying on supervised techniques are often evaluated with performance metrics on pre-collected validation datasets [32], but low prediction error in offline testing is not necessarily correlated with driving quality [33]. We focus on longitudinal control and extend our previous work on RL-based longitudinal control in a highway driving environment [20].

Safety Cages
Reinforcement Learning
Deep Deterministic Policy Gradient
Highway Vehicle Following Use-Case
Training
Results
Naturalistic Testing
Adversarial Testing
Conclusions
