Spatio-Temporal Adaptation With Dilated Neighbourhood Attention For Accident Anticipation

Abstract

Traffic accident anticipation, the task of predicting an accident before it occurs, is crucial for autonomous vehicles. In this study, we introduce a novel approach that uses Spatial and Temporal Adapters designed for image-to-video adaptation through parameter-efficient transfer learning (PEFTL). To fully leverage the knowledge in a pretrained CLIP Vision Transformer (CLIP-ViT), the proposed architecture incorporates lightweight Dilated Neighbourhood Attention (DNA) within the Adapters. In the Temporal Adapter, DNA is further combined with a cross-attention mechanism to capture long-range temporal dependencies. Together, these Adapters significantly enhance spatio-temporal adaptation, addressing the limitations of existing methods in accurately identifying accident-prone areas while still anticipating accidents early, in an end-to-end manner. Extensive experiments on two widely used benchmark datasets, DAD and CCD, demonstrate notable performance improvements over state-of-the-art methods.
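
The abstract does not give implementation details, so the following is only a minimal sketch of the general idea behind dilated neighbourhood attention, not the authors' Adapter. It assumes a 1D sequence of frame tokens, a single head, and an odd kernel size; each token attends only to a few positions spaced `dilation` steps apart around it. The function names, and the choice to drop out-of-range neighbours at the sequence borders instead of shifting the window, are illustrative assumptions.

```python
import math


def dilated_neighbours(i, length, kernel_size, dilation):
    """Positions token i attends to: up to kernel_size indices, spaced
    `dilation` apart and centred on i (assumes an odd kernel_size).
    Simplification: neighbours falling outside the sequence are dropped."""
    half = kernel_size // 2
    candidates = (i + dilation * offset for offset in range(-half, half + 1))
    return [j for j in candidates if 0 <= j < length]


def dilated_neighbourhood_attention(q, k, v, kernel_size=3, dilation=2):
    """Single-head scaled dot-product attention restricted to the dilated
    neighbourhood of each token. q, k, v are lists of equal-length vectors."""
    length, dim = len(q), len(q[0])
    scale = 1.0 / math.sqrt(dim)
    out = []
    for i in range(length):
        idx = dilated_neighbours(i, length, kernel_size, dilation)
        # Scores against the neighbourhood only, not the full sequence.
        scores = [scale * sum(a * b for a, b in zip(q[i], k[j])) for j in idx]
        # Numerically stable softmax over the neighbourhood.
        peak = max(scores)
        weights = [math.exp(s - peak) for s in scores]
        total = sum(weights)
        # Weighted sum of the neighbours' value vectors.
        out.append([
            sum(w * v[j][d] for w, j in zip(weights, idx)) / total
            for d in range(len(v[0]))
        ])
    return out
```

With `kernel_size=3` and `dilation=2`, a token at position 4 attends to positions 2, 4 and 6. The cost per token depends on the kernel size and not on the sequence length, which is why this kind of attention is usually described as lightweight; increasing the dilation widens the temporal span covered without adding neighbours.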
