Exploring Contextual Representation and Multi-modality for End-to-end Autonomous Driving

Shoaib Azam, Farzeen Munir, Ville Kyrki, Tomasz Piotr Kucner, Moongu Jeon, Witold Pedrycz

doi:10.1016/j.engappai.2024.108767

Abstract

Learning contextual and spatial environmental representations enhances an autonomous vehicle’s hazard anticipation and decision-making in complex scenarios. Recent perception systems enhance spatial understanding with sensor fusion but often lack global environmental context. Humans, when driving, naturally employ neural maps that integrate various factors such as historical data, situational subtleties, and behavioral predictions of other road users to form a rich contextual understanding of their surroundings. This neural map-based comprehension is integral to making informed decisions on the road. In contrast, even with their significant advancements, autonomous systems have yet to fully harness this depth of human-like contextual understanding. Motivated by this, our work draws inspiration from human driving patterns and seeks to formalize the sensor fusion approach within an end-to-end autonomous driving framework. We introduce a framework that integrates three cameras (left, right, and center) to emulate the human field of view, coupled with top-down bird’s-eye-view semantic data to enhance contextual representation. The sensor data is fused and encoded using a self-attention mechanism, followed by an auto-regressive waypoint prediction module. We treat feature representation as a sequential problem, employing a vision transformer to distill the contextual interplay between sensor modalities. The efficacy of the proposed method is experimentally evaluated in both open-loop and closed-loop settings. Our method achieves a displacement error of 0.67 m in open-loop settings, surpassing current methods by 6.9% on the nuScenes dataset. In closed-loop evaluations on CARLA’s Town05 Long and Longest6 benchmarks, the proposed method improves driving performance and route completion and reduces infractions.
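
The abstract reports an open-loop displacement error but does not spell out how it is computed. As a rough illustration only, the sketch below assumes the common definition: the mean Euclidean (L2) distance between predicted and ground-truth waypoints. The function name `average_displacement_error` and the sample waypoints are hypothetical and are not taken from the paper.

```python
import math


def average_displacement_error(predicted, ground_truth):
    """Mean Euclidean (L2) distance between matching 2D waypoints, in metres.

    Assumed definition for illustration; the paper's exact metric may differ.
    """
    if len(predicted) != len(ground_truth) or not predicted:
        raise ValueError("trajectories must be non-empty and of equal length")
    distances = [math.dist(p, g) for p, g in zip(predicted, ground_truth)]
    return sum(distances) / len(distances)


# Hypothetical (x, y) waypoints in metres; not data from the paper.
predicted = [(1.0, 0.0), (2.0, 0.5), (3.0, 1.0)]
ground_truth = [(1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]

ade = average_displacement_error(predicted, ground_truth)
print(f"average displacement error: {ade:.2f} m")
```

Under this assumed definition, a lower value means the predicted waypoints lie closer to the reference trajectory on average.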

Journal: Engineering Applications of Artificial Intelligence	Publication Date: Jun 10, 2024
Citations: 1	License type: cc-by

Similar Papers

An End-to-End Motion Planner Using Sensor Fusion for Autonomous Driving
Nguyen Thi Hoai Thu ... Dong Seog Han
20 Feb 2023

Multi-Task Environmental Perception Methods for Autonomous Driving.
Ri Liu ... Yunchuan Yang
Sensors (Basel, Switzerland) | VOL. 24
28 Aug 2024

Named Entity Recognition in Persian Language based on Self-attention Mechanism with Weighted Relational Position Encoding
Ebrahim Ganjalipour ... Sohrab Kordrostami
ACM Transactions on Asian and Low-Resource Language Information Processing | VOL. 22
19 Dec 2023

Contextual representations increase analogue traumatic intrusions: Evidence against a dual-representation account of peri-traumatic processing
David G Pearson
Journal of Behavior Therapy and Experimental Psychiatry | VOL. 43
21 Apr 2012
