Abstract

Traffic scenes change dynamically, and experienced drivers allocate visual attention effectively within brief time intervals, focusing on salient targets and regions in advance to ensure safe driving. Modeling how drivers allocate visual attention is therefore important for the development of driver assistance systems and autonomous vehicles. Recently, many neural-network-based visual attention models inspired by top-down and bottom-up attention mechanisms have been proposed. However, these models tend to produce overly dispersed predictions for specific driving scenarios and achieve relatively low accuracy, making them impractical for real-world use. To overcome these challenges, we propose a driver visual attention model (DVAM) built on an encoder-decoder architecture that combines a convolutional neural network (CNN) with recurrent neural networks (RNNs). Specifically, the model exploits information along both the temporal and spatial dimensions to better mimic realistic dynamic driving and to extract rich features, ensuring that the final predictions are faithful and effective. Extensive experiments on the DR(eye)VE dataset demonstrate that the proposed DVAM predicts driver attention to regions and targets more robustly and precisely than state-of-the-art methods. Finally, we evaluate the model on the TDV dataset to validate its generalization capability, demonstrating its potential for field deployment.
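
To make the encoder-decoder idea concrete, the sketch below shows one plausible way a CNN encoder, a recurrent module, and a decoder could be combined to map a short driving clip to a per-pixel attention map. The abstract does not specify layer sizes or the exact recurrent unit, so every architectural choice here (channel counts, a convolutional GRU as the RNN, bilinear upsampling in the decoder) is an illustrative assumption, not the authors' implementation.

```python
# Minimal sketch of a CNN+RNN encoder-decoder attention predictor,
# in the spirit of the DVAM described above. All hyperparameters are
# assumptions; only the overall structure follows the abstract.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ConvGRUCell(nn.Module):
    """Convolutional GRU cell: one common way to fuse spatial (CNN)
    and temporal (RNN) information directly on feature maps."""
    def __init__(self, in_ch, hid_ch, k=3):
        super().__init__()
        p = k // 2
        self.zr = nn.Conv2d(in_ch + hid_ch, 2 * hid_ch, k, padding=p)
        self.h = nn.Conv2d(in_ch + hid_ch, hid_ch, k, padding=p)

    def forward(self, x, h):
        z, r = torch.sigmoid(self.zr(torch.cat([x, h], dim=1))).chunk(2, dim=1)
        h_tilde = torch.tanh(self.h(torch.cat([x, r * h], dim=1)))
        return (1 - z) * h + z * h_tilde

class DVAMSketch(nn.Module):
    def __init__(self, hid_ch=64):
        super().__init__()
        # CNN encoder: extracts per-frame spatial features at 1/8 resolution.
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, hid_ch, 3, stride=2, padding=1), nn.ReLU(),
        )
        # Recurrent module: aggregates encoded features across frames.
        self.rnn = ConvGRUCell(hid_ch, hid_ch)
        # Decoder head: emits a single-channel attention (saliency) map.
        self.decoder = nn.Sequential(
            nn.Conv2d(hid_ch, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 1, 1),
        )

    def forward(self, clip):
        # clip: (batch, time, 3, H, W) snippet of the driving scene.
        b, t, c, H, W = clip.shape
        h = None
        for i in range(t):
            f = self.encoder(clip[:, i])
            if h is None:
                h = torch.zeros_like(f)
            h = self.rnn(f, h)
        logits = F.interpolate(self.decoder(h), size=(H, W),
                               mode="bilinear", align_corners=False)
        # Softmax over all pixels yields a probability map of attention.
        return torch.softmax(logits.flatten(1), dim=1).view(b, 1, H, W)

if __name__ == "__main__":
    model = DVAMSketch()
    dummy = torch.randn(2, 8, 3, 112, 112)  # 2 clips of 8 frames each
    print(model(dummy).shape)               # torch.Size([2, 1, 112, 112])
```

In practice such a model would be trained against recorded gaze maps (e.g., from DR(eye)VE) with a distribution-matching loss such as KL divergence between predicted and ground-truth attention maps; the softmax output above is shaped for exactly that kind of objective.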
