Abstract

Driving in a dynamic, multi-agent, and complex urban environment is a difficult task requiring a complex decision-making policy. Learning such a policy requires a state representation that can encode the entire environment. Mid-level representations that encode a vehicle’s environment as images have become a popular choice, but they are quite high-dimensional, limiting their use in data-hungry approaches such as reinforcement learning. In this article, we propose to learn a low-dimensional yet rich latent representation of the environment by leveraging knowledge of relevant semantic factors. To do this, we train an encoder-decoder deep neural network to predict multiple application-relevant factors, such as the trajectories of other agents and of the ego car. Furthermore, we propose a hazard signal, based on other vehicles’ future trajectories and the planned route, which is used in conjunction with the learned latent representation as input to a downstream policy. We demonstrate that the multi-head encoder-decoder neural network yields a more informative representation than a standard single-head model. In particular, the proposed representation learning and the hazard signal help reinforcement learning to learn faster, with increased performance and less data than baseline methods.
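
To make the multi-head encoder-decoder idea concrete, the sketch below shows one shared convolutional encoder feeding several decoder heads: image reconstruction plus ego and other-agent trajectory prediction. This is a minimal sketch under assumed settings (64x64 bird's-eye-view input, 64-d latent, illustrative head names and sizes); it is not the authors' exact architecture.

```python
import torch
import torch.nn as nn

class MultiHeadVAE(nn.Module):
    """Minimal multi-head VAE sketch: a shared encoder and several decoder
    heads, each predicting an application-relevant factor. Head names and
    dimensions are illustrative assumptions, not the paper's architecture."""

    def __init__(self, in_channels=3, latent_dim=64):
        super().__init__()
        # Convolutional encoder mapping a 64x64 bird's-eye-view image to latent stats.
        self.encoder = nn.Sequential(
            nn.Conv2d(in_channels, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 4, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, 128, 4, stride=2, padding=1), nn.ReLU(),
            nn.Flatten(),
        )
        feat_dim = 128 * 8 * 8  # 64x64 input halved three times -> 8x8
        self.fc_mu = nn.Linear(feat_dim, latent_dim)
        self.fc_logvar = nn.Linear(feat_dim, latent_dim)

        # Auxiliary heads: input reconstruction plus trajectory predictions.
        def head(out_dim):
            return nn.Sequential(nn.Linear(latent_dim, 256), nn.ReLU(),
                                 nn.Linear(256, out_dim))
        self.recon_head = head(in_channels * 64 * 64)  # reconstruct the input image
        self.ego_traj_head = head(10 * 2)              # 10 future ego (x, y) points
        self.agents_traj_head = head(5 * 10 * 2)       # 5 agents x 10 points each

    def forward(self, x):
        h = self.encoder(x)
        mu, logvar = self.fc_mu(h), self.fc_logvar(h)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)  # reparameterization
        return {
            "recon": self.recon_head(z),
            "ego_traj": self.ego_traj_head(z),
            "agents_traj": self.agents_traj_head(z),
            "mu": mu, "logvar": logvar,
        }

# Quick shape check.
out = MultiHeadVAE()(torch.randn(2, 3, 64, 64))
print(out["ego_traj"].shape)  # torch.Size([2, 20])
```

Training such a model would sum per-head prediction losses with the usual KL regularization term, as in a standard VAE objective.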

Highlights

  • Driving in unstructured and dynamic urban environments is an arduous task

  • Even mid-level representations may be so high-dimensional that their use with data-hungry methods such as reinforcement learning (RL) is limited

  • The primary contributions of this work are: (a) a multitask network with auxiliary heads that improves the quality of low-dimensional representations, (b) a hazard signal computed from the likelihood between the planned route and the predicted trajectories of dynamic agents (a minimal sketch of this signal follows this list), and (c) an experimental study of RL policy learning showing that the latent vector learned with auxiliary tasks, together with the hazard signal, helps the policy to (i) train faster, (ii) perform better, (iii) solve the task with less data, and (iv) generalize better to new scenarios
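
One plausible formulation of the hazard signal is sketched below: score how likely any predicted agent position falls on the ego vehicle's planned route. The Gaussian proximity kernel, the worst-case (max) aggregation, and the function name `hazard_signal` are illustrative assumptions, not the paper's exact computation.

```python
import numpy as np

def hazard_signal(route, agent_trajs, sigma=1.0):
    """Hypothetical hazard score in [0, 1].

    route:       (R, 2) array of planned route waypoints (x, y) in metres.
    agent_trajs: (A, T, 2) array of predicted positions for A agents over
                 T future time steps.
    sigma:       spatial scale (m) of the Gaussian proximity kernel.
    """
    # Pairwise offsets between every predicted position and every route point.
    diffs = agent_trajs[:, :, None, :] - route[None, None, :, :]        # (A, T, R, 2)
    dists = np.linalg.norm(diffs, axis=-1)                              # (A, T, R)
    # Gaussian likelihood of each prediction w.r.t. its nearest route point.
    likelihood = np.exp(-dists.min(axis=-1) ** 2 / (2.0 * sigma ** 2))  # (A, T)
    # Hazard = worst case over agents and time steps.
    return float(likelihood.max())

# Example: one agent predicted to cross the route, one far away from it.
route = np.stack([np.linspace(0, 20, 21), np.zeros(21)], axis=1)
crossing = np.stack([np.full(5, 10.0), np.linspace(4, -4, 5)], axis=1)
distant = np.stack([np.linspace(0, 4, 5), np.full(5, 30.0)], axis=1)
print(hazard_signal(route, np.stack([crossing, distant])))  # close to 1.0
```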

Summary

INTRODUCTION

Driving in unstructured and dynamic urban environments is an arduous task. Many moving agents such as cars, bicycles, and pedestrians affect the driver’s behavior and decisions. The primary contributions of this work are: (a) a multitask network with auxiliary heads that improves the quality of low-dimensional representations, (b) a hazard signal computed from the likelihood between the planned route and the predicted trajectories of dynamic agents, and (c) an experimental study of RL policy learning showing that the latent vector learned with auxiliary tasks, together with the hazard signal, helps the policy to (i) train faster, (ii) perform better, (iii) solve the task with less data, and (iv) generalize better to new scenarios. The sketch below illustrates how these two inputs can feed the downstream policy.
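
As an illustration of how the learned latent vector and the hazard signal can drive the policy, here is a minimal DQN sketch: a Q-network over the concatenation [z; hazard] plus a single temporal-difference update. The layer sizes, three-action space, and replay-batch layout are assumptions for illustration, not the paper's implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Hypothetical dimensions: 64-d latent plus a scalar hazard, 3 discrete actions.
LATENT_DIM, N_ACTIONS = 64, 3

class QNet(nn.Module):
    """Small MLP mapping [latent z ; hazard] to Q-values (illustrative sizes)."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(LATENT_DIM + 1, 128), nn.ReLU(),
            nn.Linear(128, 128), nn.ReLU(),
            nn.Linear(128, N_ACTIONS),
        )
    def forward(self, z, hazard):
        return self.net(torch.cat([z, hazard], dim=-1))

q_net, target_net = QNet(), QNet()
target_net.load_state_dict(q_net.state_dict())
opt = torch.optim.Adam(q_net.parameters(), lr=1e-4)

def td_update(batch, gamma=0.99):
    """One DQN step on a replay batch (z, hazard, action, reward, z2, hazard2, done)."""
    z, h, a, r, z2, h2, done = batch
    q = q_net(z, h).gather(1, a.unsqueeze(1)).squeeze(1)  # Q(s, a) taken in the batch
    with torch.no_grad():  # bootstrapped target from the frozen target network
        target = r + gamma * (1 - done) * target_net(z2, h2).max(dim=1).values
    loss = F.smooth_l1_loss(q, target)
    opt.zero_grad(); loss.backward(); opt.step()
    return loss.item()
```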

RELATED WORK
Variational Auto-Encoder
Reinforcement Learning
Deep Q-Network
METHOD
Learning Latent Representation Using Multi-Head VAE
Generation of Hazard Signal
Policy Learning Using DQN
EXPERIMENTS
Simulation Environment and Data Collection
Implementation Details
Effect of Different Heads and the Hazard Signal
Comparison to Baselines
Effect of the Dataset Size
Qualitative Analysis
Generalization Analysis
Findings
CONCLUSIONS