Abstract

Variable head pose and low-quality eye images in natural scenes can reduce the accuracy of gaze estimation. In this paper, we propose a multi-feature fusion gaze estimation model based on the attention mechanism. First, we design face and eye feature extractors based on a group convolution channel and spatial attention mechanism (GCCSAM), which uses channel and spatial information to adaptively select and enhance important features in the face image and the two eye images while suppressing information irrelevant to gaze estimation. Then, we design two feature fusion networks that fuse the features of the face, the two eyes, and the pupil center position, thus avoiding the effects of two-eye asymmetry and inaccurate head pose estimation on gaze estimation. The proposed method achieves an average angular error of 4.1° on MPIIGaze and 5.2° on EyeDiap. Compared with current mainstream methods, our method improves the accuracy and robustness of gaze estimation in natural scenes.
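
The results above are reported as an average angular error in degrees. The abstract does not specify how the metric is computed; a common convention, assumed here, is the angle between the predicted and ground-truth 3D gaze direction vectors, averaged over all samples. The following is a minimal sketch under that assumption. The function names `angular_error_deg` and `mean_angular_error_deg` are illustrative and do not come from the paper.

```python
import math


def angular_error_deg(pred, true):
    """Angle in degrees between two 3D gaze direction vectors.

    Assumption: gaze is represented as a 3D direction vector; the vectors
    do not need to be unit length because they are normalized here.
    """
    dot = sum(p * t for p, t in zip(pred, true))
    norm_pred = math.sqrt(sum(p * p for p in pred))
    norm_true = math.sqrt(sum(t * t for t in true))
    if norm_pred == 0.0 or norm_true == 0.0:
        raise ValueError("gaze vectors must be non-zero")
    # Clamp to [-1, 1] so floating-point rounding cannot break acos.
    cos_angle = max(-1.0, min(1.0, dot / (norm_pred * norm_true)))
    return math.degrees(math.acos(cos_angle))


def mean_angular_error_deg(preds, trues):
    """Average angular error in degrees over paired predictions and labels."""
    errors = [angular_error_deg(p, t) for p, t in zip(preds, trues)]
    if not errors:
        raise ValueError("at least one sample is required")
    return sum(errors) / len(errors)
```

Under this convention, a prediction identical to the ground truth gives an error of 0°, and two perpendicular gaze directions give 90°.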
