Abstract

Accident anticipation (or, more generally, the prediction of abnormal events) aims to forecast accidents before they occur by assessing risk from the preceding frames of a video. This risk assessment relies heavily on understanding the semantics of the scene context and predicting the interactions among the involved subjects. Indeed, fully exploiting both the spatial relationships among the subjects of immediate interest within a single frame and the temporal dependencies across consecutive frames is crucial for video accident anticipation. To address this challenge, we propose a novel approach called Dynamic Attention Augmented Graph Network (DAA-GNN), which leverages underlying spatial cues and models the relationships among detected subjects of immediate interest. Specifically, our approach employs a graph neural network enhanced by global context cues, enabling effective message propagation and the discovery of interactions among the subjects of interest in the scene. DAA-GNN also includes a temporal attention module designed to capture long-term dependencies along the temporal axis, yielding an end-to-end deep network for accurate accident anticipation. We extensively evaluate our method on the publicly available Dashcam Accident Dataset (DAD) and Epic Fail (EF) datasets. The results demonstrate that our method outperforms state-of-the-art accident anticipation methods. Our source code and datasets are available at https://github.com/ZxyLinkstart/DAA-GNN.
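
The abstract describes a two-stage design: per-frame message passing over detected subjects, augmented with a global scene feature, followed by temporal attention across frames. The PyTorch sketch below illustrates that structure only in outline; the module names, feature dimensions, single-layer design, and pooling choices are illustrative assumptions, not the authors' released implementation (see the repository linked above for the actual model).

```python
# Minimal sketch of a context-augmented GNN with temporal attention for
# accident anticipation. All module names and shapes are assumptions made
# for illustration; they do not reproduce the official DAA-GNN code.
import torch
import torch.nn as nn
import torch.nn.functional as F


class ContextAugmentedGNNLayer(nn.Module):
    """Message passing over detected subjects, fused with a global scene feature."""

    def __init__(self, dim):
        super().__init__()
        self.msg = nn.Linear(2 * dim, dim)   # message from each (node, neighbor) pair
        self.fuse = nn.Linear(2 * dim, dim)  # fuse aggregated node state with context

    def forward(self, x, adj, g):
        # x: (N, D) subject features, adj: (N, N) adjacency, g: (D,) global context
        n = x.size(0)
        pairs = torch.cat([x.unsqueeze(1).expand(n, n, -1),
                           x.unsqueeze(0).expand(n, n, -1)], dim=-1)
        messages = F.relu(self.msg(pairs)) * adj.unsqueeze(-1)          # mask non-edges
        agg = messages.sum(dim=1) / adj.sum(dim=1, keepdim=True).clamp(min=1)
        fused = torch.cat([x + agg, g.expand(n, -1)], dim=-1)
        return F.relu(self.fuse(fused))


class TemporalAttention(nn.Module):
    """Causal self-attention over frames to capture long-term temporal dependencies."""

    def __init__(self, dim, heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, h):                      # h: (T, D) per-frame features
        t = h.size(0)
        causal = torch.triu(torch.ones(t, t, dtype=torch.bool), diagonal=1)
        out, _ = self.attn(h.unsqueeze(0), h.unsqueeze(0), h.unsqueeze(0),
                           attn_mask=causal)  # each frame attends only to the past
        return out.squeeze(0)                  # (T, D)


class DAAGNNSketch(nn.Module):
    """End-to-end sketch: spatial reasoning per frame, then temporal attention."""

    def __init__(self, dim=256):
        super().__init__()
        self.gnn = ContextAugmentedGNNLayer(dim)
        self.temporal = TemporalAttention(dim)
        self.head = nn.Linear(dim, 1)

    def forward(self, node_feats, adjs, globals_):
        # node_feats: list of (N_t, D); adjs: list of (N_t, N_t); globals_: (T, D)
        frame_feats = torch.stack([
            self.gnn(x, a, g).mean(dim=0)      # pool subjects into one frame feature
            for x, a, g in zip(node_feats, adjs, globals_)
        ])                                      # (T, D)
        summary = self.temporal(frame_feats)    # (T, D)
        return torch.sigmoid(self.head(summary)).squeeze(-1)  # per-frame risk in [0, 1]
```

A toy call would pass, for each frame, the detected subjects' features, an adjacency matrix over them, and one global frame descriptor (e.g. from a CNN backbone), and read out a rising risk score as the accident approaches; the real training objective and feature extractors are described in the paper and repository.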
