Abstract

Predicting the trajectories of surrounding agents in heterogeneous traffic environments remains a challenging task for autonomous driving. Several critical issues hinder prediction accuracy, including understanding social interactions among agents and the environment, handling multiclass traffic movements, and generating feasible trajectories that comply with real‐world rules. To address these issues, this study presents a new multimodal trajectory prediction framework based on the transformer network. A hierarchical‐structured context‐aware module, inspired by human perceptual logic, is proposed to capture contextual information within the scene. An efficient linear global attention mechanism is also proposed to reduce the computation and memory load of the transformer framework. Additionally, this study introduces a novel auxiliary loss to penalize infeasible off‐road predictions. Empirical results on the Lyft l5kit data set demonstrate the state‐of‐the‐art performance of the proposed model, which substantially enhances the accuracy and feasibility of prediction outcomes. The proposed model can also deal effectively with missing input observations. This study underscores the importance, for autonomous driving, of comprehending social interactions among agents and the environment, handling multiclass traffic movements, and generating feasible trajectories that adhere to real‐world rules.

