Abstract

Head pose and eye gaze are vital cues for analysing a driver's visual attention. Previous approaches achieve promising results from point clouds in constrained conditions, but they face challenges in complex naturalistic driving scenes. One challenge is that point cloud data collected under non-uniform illumination and large head rotation is prone to partial facial occlusion, which leads to erroneous transformations when template matching fails or features are extracted incorrectly. In this paper, a novel method is proposed for accurately estimating driver head pose and gaze zone using an RGB-D camera, with an effective point cloud fusion and registration strategy. In the fusion step, to reduce erroneous transformations, consecutive multi-frame point clouds are registered and fused to generate a stable point cloud. In the registration step, to reduce reliance on a single registration template, multiple point clouds in the nearest-neighbor gaze zone are used as the template point cloud. A coarse transformation computed by the normal distributions transform is used as the initial transformation and then updated with a particle filter. A gaze zone estimator is trained by combining head pose and eye image features, in which the head pose is predicted by point cloud registration and the eye image features are extracted via multi-scale sparse coding. Extensive experiments demonstrate that the proposed strategy improves head pose tracking and achieves a low error on gaze zone classification.
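
The coarse-to-fine registration step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the coarse pose is simply assumed to be given (in the paper it comes from the normal distributions transform), the error measure is a brute-force nearest-neighbor distance, and all names (`refine_pose`, `registration_error`, the six-parameter pose layout, the noise scales) are hypothetical choices made for illustration only.

```python
import numpy as np


def euler_to_matrix(rx, ry, rz):
    """Rotation matrix R = Rz @ Ry @ Rx from Euler angles in radians."""
    cx, sx = np.cos(rx), np.sin(rx)
    cy, sy = np.cos(ry), np.sin(ry)
    cz, sz = np.cos(rz), np.sin(rz)
    Rx = np.array([[1.0, 0.0, 0.0], [0.0, cx, -sx], [0.0, sx, cx]])
    Ry = np.array([[cy, 0.0, sy], [0.0, 1.0, 0.0], [-sy, 0.0, cy]])
    Rz = np.array([[cz, -sz, 0.0], [sz, cz, 0.0], [0.0, 0.0, 1.0]])
    return Rz @ Ry @ Rx


def apply_pose(pose, points):
    """Apply a pose (rx, ry, rz, tx, ty, tz) to an (N, 3) point array."""
    R = euler_to_matrix(pose[0], pose[1], pose[2])
    return points @ R.T + pose[3:]


def registration_error(pose, source, template):
    """Mean squared distance from each moved source point to its nearest template point."""
    moved = apply_pose(pose, source)
    d2 = ((moved[:, None, :] - template[None, :, :]) ** 2).sum(axis=2)
    return d2.min(axis=1).mean()


def refine_pose(coarse_pose, source, template,
                n_particles=200, n_iters=15, sigma=(0.05, 0.02), seed=0):
    """Refine a coarse pose with a simple particle filter (illustrative only).

    sigma holds the assumed initial noise scales for rotation (rad) and translation.
    """
    rng = np.random.default_rng(seed)
    coarse_pose = np.asarray(coarse_pose, dtype=float)
    particles = np.tile(coarse_pose, (n_particles, 1))
    scale = np.array([sigma[0]] * 3 + [sigma[1]] * 3)

    # Track the best pose seen so far, starting from the coarse estimate.
    best_pose = coarse_pose.copy()
    best_err = registration_error(best_pose, source, template)

    for _ in range(n_iters):
        # Predict: diffuse particles with Gaussian noise.
        particles = particles + rng.normal(0.0, 1.0, particles.shape) * scale
        # Update: score each particle against the template point cloud.
        errs = np.array([registration_error(p, source, template) for p in particles])
        i = int(errs.argmin())
        if errs[i] < best_err:
            best_err, best_pose = errs[i], particles[i].copy()
        # Resample in proportion to weights that favor low registration error.
        weights = np.exp(-(errs - errs.min()) / (errs.mean() + 1e-12))
        weights /= weights.sum()
        particles = particles[rng.choice(n_particles, size=n_particles, p=weights)]
        # Shrink the noise so the search narrows over iterations.
        scale = scale * 0.8

    return best_pose, best_err


# Synthetic example: the template is the source moved by a known pose,
# and the coarse pose is a perturbed version of that true pose.
data_rng = np.random.default_rng(42)
source = data_rng.uniform(-1.0, 1.0, size=(60, 3))
true_pose = np.array([0.10, -0.20, 0.15, 0.05, -0.03, 0.02])
template = apply_pose(true_pose, source)
coarse_pose = true_pose + np.array([0.08, -0.06, 0.05, 0.03, 0.02, -0.02])

coarse_err = registration_error(coarse_pose, source, template)
refined_pose, refined_err = refine_pose(coarse_pose, source, template)
print("coarse error:", coarse_err, "refined error:", refined_err)
```

Because the sketch keeps the best pose seen so far and starts from the coarse estimate, the refined error can never exceed the coarse error. A real pipeline would replace the brute-force distance with a spatial index and would handle occluded or missing points, which this example ignores.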
