Abstract

Accurately understanding traffic surroundings is crucial for various autonomous and assisted driving scenarios. The visual perception system must capture the entire scene, including vehicle positions, road conditions, and lane configurations. While existing methods jointly train models for these tasks, they overlook the topological relationships among roads, lanes, and traffic objects in images. In this paper, we propose leveraging the inherent structural relations among these tasks to enhance precise panoptic driving perception. To this end, we introduce a cross-task relation mining (CRM) method. Self-attention mechanisms blend key spatial features within each task, while cross-attention facilitates essential information exchange between tasks, resulting in a more comprehensive scene interpretation. Extensive experiments demonstrate the effectiveness of our approach in complex traffic scenarios.
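The attention scheme described above can be sketched as follows. This is a minimal NumPy illustration, not the paper's implementation: the function names, feature shapes, and the residual combination of self- and cross-attention outputs are all assumptions made for clarity.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax over the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention(q, k, v):
    # Scaled dot-product attention: softmax(Q K^T / sqrt(d)) V.
    d = q.shape[-1]
    scores = q @ k.T / np.sqrt(d)
    return softmax(scores, axis=-1) @ v

def cross_task_relation(feat_det, feat_lane):
    """Hypothetical sketch of cross-task relation mining over two task
    feature sets (e.g. object detection and lane estimation)."""
    # Self-attention blends key spatial features within each task.
    det = attention(feat_det, feat_det, feat_det)
    lane = attention(feat_lane, feat_lane, feat_lane)
    # Cross-attention exchanges information between tasks: each task's
    # refined features query the other task's refined features.
    det_out = det + attention(det, lane, lane)
    lane_out = lane + attention(lane, det, det)
    return det_out, lane_out

rng = np.random.default_rng(0)
det_feats = rng.normal(size=(10, 8))   # 10 detection tokens, dim 8
lane_feats = rng.normal(size=(12, 8))  # 12 lane tokens, dim 8
det_out, lane_out = cross_task_relation(det_feats, lane_feats)
```

Each output keeps its own task's token count while incorporating context from the other task, which is the property that lets such a block be dropped between task-specific decoder heads.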
