Abstract

Appearance-based gaze estimation provides relatively unconstrained gaze tracking. However, subject-independent models achieve limited accuracy, partly due to individual variations. To improve estimation, we propose a gaze decomposition method that enables low-complexity calibration, i.e., using calibration data collected when subjects view only one or a few gaze targets and the number of images per gaze target is small. Lowering the complexity of calibration makes it more convenient and less time-consuming for the user, and more widely applicable. Motivated by our finding that the inter-subject squared bias exceeds the intra-subject variance for a subject-independent estimator, we decompose the gaze estimate into the sum of a subject-independent term, estimated from the input image by a deep convolutional network, and a subject-dependent bias term. During training, both the weights of the deep network and the bias terms are estimated. During testing, if no calibration data is available, we set the bias term to zero. Otherwise, the bias term can be estimated from images of the subject gazing at known gaze targets. Experimental results on three datasets show that, without calibration, our method outperforms the state of the art by at least 6.3%. For low-complexity calibration sets, our method outperforms other calibration methods. More complex calibration algorithms do not outperform our method until the size of the calibration set is excessively large. Even then, the gains obtained by alternatives are small, e.g., only 0.1° lower error for 64 gaze targets. Source code is available at https://github.com/czk32611/Gaze-Decomposition.
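The decomposition can be summarized as ĝ = f(I) + b, where f(I) is the subject-independent estimate produced by the network and b is the per-subject bias. The sketch below illustrates test-time calibration under the natural assumption that b is fit as the mean residual between known gaze targets and the network's predictions over the calibration images; the names (`model`, `estimate_bias`, `predict_gaze`) and the 2D yaw/pitch output format are illustrative assumptions, not the authors' released code.

```python
import numpy as np

def estimate_bias(model, calib_images, calib_targets):
    """Fit the subject-dependent bias b from a (possibly tiny) calibration set.

    model:         callable mapping an image to a 2D gaze estimate (yaw, pitch)
    calib_images:  list of images of the subject gazing at known targets
    calib_targets: array of shape (N, 2), the known gaze directions
    Returns the mean residual, which is the least-squares constant offset.
    """
    preds = np.stack([model(img) for img in calib_images])   # (N, 2)
    targets = np.asarray(calib_targets, dtype=float)          # (N, 2)
    return (targets - preds).mean(axis=0)                     # bias b, shape (2,)

def predict_gaze(model, image, bias=None):
    """Final estimate g = f(I) + b; with no calibration data, b defaults to zero."""
    g = model(image)
    return g if bias is None else g + bias
```

Because b is a single constant offset per subject, even one gaze target with a handful of images suffices to estimate it, which is what keeps the calibration low-complexity.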
