Abstract
Appearance-based gaze estimation has been widely studied in recent years with promising performance. The majority of appearance-based gaze estimation methods are developed under deterministic frameworks. However, deterministic gaze estimation methods suffer a large performance drop on challenging eye images, e.g., those with low resolution, darkness, or partial occlusions. To alleviate this problem, in this article we reformulate the appearance-based gaze estimation problem under a generative framework. Specifically, we propose a variational inference model, the variational gaze estimation network (VGE-Net), to generate multiple gaze maps as complementary candidates, simultaneously supervised by the ground-truth gaze map. To achieve robust estimation, we adaptively fuse the gaze directions predicted from these candidate gaze maps by a regression network through a simple attention mechanism. Experiments on three benchmarks, i.e., MPIIGaze, EYEDIAP, and Columbia, demonstrate that our VGE-Net outperforms state-of-the-art gaze estimation methods, especially on challenging cases. Comprehensive ablation studies also validate the effectiveness of our contributions. The code will be publicly released.