Abstract

In recent years, there has been an increased interest in giving verbal commands to self-driving cars. Even though multiple companies have showcased progress towards fully autonomous vehicles, surveys have indicated that people are wary of relinquishing total control of the vehicle to the AI. Thus, a system allowing passengers to control the vehicle’s actions would be preferable. Natural language, the most widespread form of communication among humans, presents itself as the most natural control interface, and survey results confirm that the ability to give verbal commands to self-driving vehicles would make the passengers more at ease. In this work, we propose a novel system that predicts which object is referred to by the issued command and the path the car should follow through the immediate surroundings to execute the command. We experiment with different approaches and features to predict the object of interest and show that our simple but effective approach achieves state-of-the-art performance. For predicting the trajectory, we propose a model that relies on a mixture density approach for modeling the distributions of key waypoints of the trajectory in the top-down scene layout. Additionally, we investigate the influence of the two tasks on each other and show that improvements in the prediction of the object of interest lead to improvements in the trajectory prediction task. Finally, we provide the research community with an extension to the Talk2Car dataset, with new trajectory annotations for given commands.


