Abstract

In this work, we present a data-driven simulation and training engine capable of learning end-to-end autonomous vehicle control policies using only sparse rewards. By leveraging real, human-collected trajectories through an environment, we render novel training data that allows virtual agents to drive along a continuum of new local trajectories consistent with the road appearance and semantics, each with a different view of the scene. We demonstrate the ability of policies learned within our simulator to generalize to and navigate previously unseen real-world roads, without access to any human control labels during training. Our results validate the learned policy onboard a full-scale autonomous vehicle, including in previously unencountered scenarios, such as new roads and novel, complex, near-crash situations. Our methods are scalable, leverage reinforcement learning, and apply broadly to situations requiring effective perception and robust operation in the physical world.

Highlights

  • End-to-end (i.e., perception-to-control) trained neural networks for autonomous vehicles have shown great promise for lane-stable driving [1]–[3]

  • Note that while this work focuses on data-driven simulation for lane-stable driving, the presented approach also applies to learning end-to-end navigation [3], by stitching together collected trajectories so that policies can learn through arbitrary intersection configurations

Summary

Introduction

End-to-end (i.e., perception-to-control) trained neural networks for autonomous vehicles have shown great promise for lane-stable driving [1]–[3]. However, they lack methods to learn robust models at scale and require vast amounts of training data that are time-consuming and expensive to collect. Learned end-to-end driving policies and modular perception components in a driving pipeline require capturing training data from all necessary edge cases, such as recovery from off-orientation positions or even near collisions. Collecting such data is prohibitively expensive and potentially dangerous [4].

This letter was recommended for publication by Associate Editor E.
