Abstract

Driver attention modeling is a key technique for building human-centric intelligent driving systems. Motivated by the human visual mechanism, this study uses multi-level visual content as the model input: low-level texture features, middle-level optical flow, and high-level semantic information. A heterogeneous model that integrates graph and convolutional neural networks is proposed to handle this multi-level input. Unlike existing studies that rely on semantic segmentation, our study uses object detection information directly and in an interpretable manner. A graph attention network explicitly constructs the semantic information from the detected objects, whereas existing studies process features extracted by convolutional modules to build latent-space features. Further, a semantic attention module is proposed to integrate the non-Euclidean output of the graph network with the Euclidean feature maps of the convolutional neural network. Finally, the integrated features are decoded to generate a driver attention map. The proposed method is validated on three typical datasets. A comprehensive comparison and analysis demonstrate the feasibility and validity of the proposed method, as well as its ability to achieve state-of-the-art performance.
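To make the fusion idea concrete, the following is a minimal sketch, not the authors' implementation. It assumes that detected objects are represented as feature vectors on a small graph, that a simplified attention pass mixes each object with its neighbours, and that every spatial location of a convolutional feature map then attends over the resulting object embeddings. Scaled dot-product scoring stands in for the learned scoring of a real graph attention network, and all names (`graph_attention`, `semantic_attention`, the toy inputs) are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def graph_attention(nodes, adjacency):
    """One simplified attention pass over detected objects.

    nodes: list of object feature vectors (assumed representation).
    adjacency[i]: indices of the neighbours of node i, including i itself.
    Dot-product scores replace the learned attention of a real GAT layer.
    """
    scale = math.sqrt(len(nodes[0]))
    updated = []
    for i, h_i in enumerate(nodes):
        nbrs = adjacency[i]
        weights = softmax([dot(h_i, nodes[j]) / scale for j in nbrs])
        updated.append([
            sum(w * nodes[j][k] for w, j in zip(weights, nbrs))
            for k in range(len(h_i))
        ])
    return updated


def semantic_attention(feature_map, nodes):
    """Fuse graph node embeddings into a grid-shaped feature map.

    feature_map: H x W x C nested lists (Euclidean, CNN-style features).
    nodes: list of C-dimensional node embeddings (non-Euclidean output).
    Each location attends over all nodes; the attended context is added
    back to the location's own features (a residual-style fusion).
    """
    scale = math.sqrt(len(nodes[0]))
    fused = []
    for row in feature_map:
        fused_row = []
        for pixel in row:
            weights = softmax([dot(pixel, n) / scale for n in nodes])
            context = [
                sum(w * n[k] for w, n in zip(weights, nodes))
                for k in range(len(pixel))
            ]
            fused_row.append([p + c for p, c in zip(pixel, context)])
        fused.append(fused_row)
    return fused


# Toy inputs: three detected objects with 2-D features and a 2x2 feature map.
objects = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
adjacency = [[0, 1], [0, 1, 2], [2]]  # node 2 is connected only to itself
feature_map = [[[0.5, 0.1], [0.2, 0.9]],
               [[0.0, 0.0], [1.0, 1.0]]]

embeddings = graph_attention(objects, adjacency)
fused = semantic_attention(feature_map, embeddings)
```

The fused map keeps the spatial layout of the convolutional features, so under these assumptions a standard convolutional decoder could consume it to produce an attention map.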
