Abstract

A Driver Monitoring System (DMS), usually equipped with a camera, is an emerging vehicle safety system that monitors driver attentiveness and triggers timely alarms when signs of inattention are detected. Since a single indicator (e.g., eye blink rate) is insufficient and unreliable for analyzing driver attentiveness, almost all existing solutions train several independent models to identify driver facial states, such as facial landmarks, head pose, yawning, and eye state. However, apart from neglecting the inherent correlations between these related tasks, maintaining multiple models also raises challenges for safety-critical vehicle systems (e.g., hardware resources, software compatibility, and real-time response). In this paper, we propose a multi-task learning CNN framework (DANet) that unifies the relevant tasks into one model and simultaneously outputs various driver facial states. By sharing the common features and parameters of highly related tasks, DANet avoids repeated computation and mitigates single-task overfitting. More importantly, the model provides a comprehensive overview of facial states while maintaining low complexity. We also propose two novel designs: (1) a Dual-loss Block, which decomposes the pose estimation task into pose classification and coarse-to-fine regression; and (2) Head Pose Penalization, which constrains the network to predict gaze direction based on the predicted head pose. Our method achieves compelling results in both speed and accuracy on a vehicle computing platform, marking a significant step in this field.
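To make the described architecture concrete, the following is a minimal PyTorch-style sketch of a shared-backbone multi-task network with a dual-loss pose branch (coarse bin classification plus fine expected-value regression) and a gaze head conditioned on the predicted head pose. This is an illustration under stated assumptions (66 pose bins of 3 degrees, a 68-point landmark head, arbitrary layer sizes), not the authors' actual DANet implementation; all identifiers here are hypothetical.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

NUM_BINS = 66  # assumption: yaw/pitch/roll each discretized into 3-degree bins
BIN_CENTERS = torch.arange(NUM_BINS, dtype=torch.float32) * 3 - 99  # degrees

class DANetSketch(nn.Module):
    def __init__(self, feat_dim=512):
        super().__init__()
        # Shared backbone: one feature extractor feeds every task head,
        # so the related tasks reuse computation instead of running
        # several independent models.
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, feat_dim), nn.ReLU(),
        )
        # Task-specific heads on top of the shared features.
        self.landmarks = nn.Linear(feat_dim, 68 * 2)          # 68 facial landmarks
        self.eye_state = nn.Linear(feat_dim, 2)               # open / closed
        self.yawn = nn.Linear(feat_dim, 2)                    # yawning / not
        self.pose_logits = nn.Linear(feat_dim, 3 * NUM_BINS)  # yaw, pitch, roll bins
        # Gaze head also receives the predicted pose angles, one plausible
        # reading of "predict gaze direction based on predicted head pose".
        self.gaze = nn.Linear(feat_dim + 3, 2)

    def forward(self, x):
        feat = self.backbone(x)
        logits = self.pose_logits(feat).view(-1, 3, NUM_BINS)
        # Coarse-to-fine: a soft expectation over bin probabilities turns
        # the coarse classification into a continuous angle estimate.
        probs = F.softmax(logits, dim=-1)
        angles = (probs * BIN_CENTERS.to(x.device)).sum(-1)  # (B, 3) degrees
        gaze = self.gaze(torch.cat([feat, angles / 99.0], dim=1))
        return {
            "landmarks": self.landmarks(feat),
            "eye_state": self.eye_state(feat),
            "yawn": self.yawn(feat),
            "pose_logits": logits,
            "pose_angles": angles,
            "gaze": gaze,
        }

def dual_pose_loss(logits, angles, bin_labels, angle_labels, alpha=1.0):
    # Dual loss: cross-entropy on the coarse bins plus MSE on the
    # fine regressed angles, combined with a weighting factor alpha.
    ce = F.cross_entropy(logits.flatten(0, 1), bin_labels.flatten())
    mse = F.mse_loss(angles, angle_labels)
    return ce + alpha * mse
```

The soft expectation over bin probabilities keeps the coarse-to-fine step differentiable, so classification and regression losses can be trained jointly; feeding the predicted pose into the gaze head is only a sketch of how a head-pose constraint on gaze might be wired.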
