Abstract

Predicting the trajectories of heterogeneous road agents such as vehicles, cyclists, and pedestrians in dense traffic is essential for self-driving. Despite recent breakthroughs in trajectory prediction, challenges remain in world state representation, social interaction modeling, real-time computation, and road agent heterogeneity. To address these challenges, we propose a new model that employs hierarchical convolutional networks and multi-task learning to predict agents' trajectories. The model first builds an effective, unified representation of agent and scene context by rendering heterogeneous world states into a top-down multi-channel raster map. On top of this representation, hierarchical convolutional networks extract global interaction features and local features for all agents simultaneously, enabling the model to predict multiple agents' trajectories in real time in a single forward pass. In addition, we design multi-task learning branches with dynamic adaptive anchors to capture differences in the behavioral patterns of heterogeneous agents, allowing a single model to accurately predict multimodal trajectories for multiple agent classes. Extensive experiments on the public nuScenes and Lyft datasets demonstrate top model performance. Importantly, our model is 2.2x faster and more computationally stable than state-of-the-art models, making it well suited to mass-produced self-driving systems that require both accuracy and computational efficiency.
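
To make the idea of a top-down multi-channel raster map concrete, the following is a minimal sketch and not the authors' implementation. It assumes one occupancy channel per agent class, an ego-centred grid, and a fixed cell size; the names `Agent`, `CHANNELS`, and `rasterize`, as well as the grid size and resolution, are hypothetical and chosen only for illustration. A real system would also render map layers and agent history, which are omitted here.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical channel layout: one occupancy channel per agent class.
CHANNELS = {"vehicle": 0, "cyclist": 1, "pedestrian": 2}


@dataclass
class Agent:
    kind: str  # one of the keys in CHANNELS
    x: float   # metres, ego-centred, positive to the right
    y: float   # metres, ego-centred, positive upward (forward)


def rasterize(agents: List[Agent], size: int = 8,
              resolution: float = 1.0) -> List[List[List[float]]]:
    """Render agents into grid[channel][row][col].

    The grid is size x size cells, each `resolution` metres wide, centred on
    the ego position. Row 0 is the top of the image (largest y).
    """
    grid = [[[0.0] * size for _ in range(size)] for _ in CHANNELS]
    half = size * resolution / 2.0
    for agent in agents:
        # Floor division keeps out-of-range negatives negative, so they are
        # rejected by the bounds check instead of wrapping around.
        col = int((agent.x + half) // resolution)
        row = int((half - agent.y) // resolution)
        if 0 <= row < size and 0 <= col < size:
            grid[CHANNELS[agent.kind]][row][col] = 1.0
    return grid


if __name__ == "__main__":
    scene = [
        Agent("vehicle", 0.0, 0.0),
        Agent("pedestrian", -3.5, 2.5),
        Agent("cyclist", 100.0, 0.0),  # outside the raster, ignored
    ]
    raster = rasterize(scene)
    for name, idx in CHANNELS.items():
        occupied = sum(sum(row) for row in raster[idx])
        print(name, "occupied cells:", int(occupied))
```

Because every agent class lands in the same fixed-size grid, a convolutional network can consume the whole scene at once regardless of how many agents are present, which is the property the abstract relies on for single-pass multi-agent prediction.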
