Abstract

Traffic accident anticipation is a vital function of Automated Driving Systems (ADS) in providing a safety-guaranteed driving experience. An accident anticipation model aims to predict accidents promptly and accurately before they occur. Existing Artificial Intelligence (AI) models of accident anticipation lack a human-interpretable explanation of their decision making. Although these models perform well, they remain a black-box to the ADS users who find it to difficult to trust them. To this end, this paper presents a gated recurrent unit (GRU) network that learns spatio-temporal relational features for the early anticipation of traffic accidents from dashcam video data. A post-hoc attention mechanism named Grad-CAM (Gradient-weighted Class Activation Map) is integrated into the network to generate saliency maps as the visual explanation of the accident anticipation decision. An eye tracker captures human eye fixation points for generating human attention maps. The explainability of network-generated saliency maps is evaluated in comparison to human attention maps. Qualitative and quantitative results on a public crash data set confirm that the proposed explainable network can anticipate an accident on average 4.57 s before it occurs, with 94.02% average precision. Various post-hoc attention-based XAI methods are then evaluated and compared. This confirms that the Grad-CAM chosen by this study can generate high-quality, human-interpretable saliency maps (with 1.23 Normalized Scanpath Saliency) for explaining the crash anticipation decision. Importantly, results confirm that the proposed AI model, with a human-inspired design, can outperform humans in accident anticipation.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call