Abstract

Predicting the driver’s focus of attention (DFoA) is becoming an essential research topic for driver distraction detection and intelligent vehicles, and this work attempts to predict DFoA. However, the driving environment is a complex, dynamically changing scene. Existing methods do not fully use the available driving scene information and ignore the differing importance of objects and regions in the scene. To alleviate this, we propose a multimodal deep neural network based on an anthropomorphic attention mechanism and prior knowledge (MDNN-AAM-PK). Specifically, MDNN-AAM-PK takes a more comprehensive description of the driving scene as input: RGB images, semantic images, optical flow images and depth images of successive frames. An anthropomorphic attention mechanism is developed to calculate the importance of each pixel in the driving scene. A graph attention network is adopted to learn semantic context features. A convolutional long short-term memory network (ConvLSTM) is used to propagate the fused features across successive frames. Furthermore, a training method based on prior knowledge is designed to improve training efficiency and DFoA prediction performance. Experiments, including a comparison with state-of-the-art methods, an ablation study of the proposed method, an evaluation on different datasets and a visual assessment experiment on a vehicle simulation platform, show that the proposed method predicts DFoA accurately and outperforms state-of-the-art methods.
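
The idea of assigning an importance to each pixel from several input modalities can be illustrated with a toy sketch. This is not the anthropomorphic attention mechanism of MDNN-AAM-PK, whose details the abstract does not give. It is a generic spatial softmax, and the function names (`pixel_scores`, `spatial_softmax`), the averaging rule and the 2 x 3 maps are all assumptions made for illustration.

```python
import math

def pixel_scores(modalities):
    """Average the per-pixel values across modality maps.

    Assumption: a plain mean stands in for whatever learned scoring
    a real network would use.
    """
    height, width = len(modalities[0]), len(modalities[0][0])
    count = len(modalities)
    return [
        [sum(m[i][j] for m in modalities) / count for j in range(width)]
        for i in range(height)
    ]

def spatial_softmax(scores):
    """Turn an H x W grid of raw scores into weights that sum to 1."""
    peak = max(max(row) for row in scores)  # subtract the max for numerical stability
    exps = [[math.exp(v - peak) for v in row] for row in scores]
    total = sum(sum(row) for row in exps)
    return [[v / total for v in row] for row in exps]

# Hypothetical single-channel 2 x 3 maps, one per modality.
rgb_intensity = [[0.1, 0.2, 0.9], [0.1, 0.3, 0.8]]
semantic = [[0.0, 0.0, 1.0], [0.0, 0.0, 1.0]]
optical_flow = [[0.1, 0.1, 0.7], [0.2, 0.1, 0.6]]
depth = [[0.3, 0.2, 0.8], [0.3, 0.2, 0.5]]

attention = spatial_softmax(
    pixel_scores([rgb_intensity, semantic, optical_flow, depth])
)
```

In this toy example the top-right pixel scores highest in every modality, so it receives the largest weight in `attention`. A learned mechanism would replace the fixed mean with trainable parameters.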
