Most non-invasive gaze estimation methods ignore inter-individual differences in anatomical structure and instead regress gaze direction directly from appearance information, which limits the accuracy of person-independent gaze estimation networks. In addition, existing gaze estimation methods tend to focus only on improving generalization performance while ignoring the crucial issue of efficiency, yielding bulky models that are difficult to deploy and of questionable cost-effectiveness in practice. This paper makes the following contributions: (1) A differential network for gaze estimation with adaptive reference samples is proposed, which adaptively selects reference samples according to scene and individual characteristics. (2) Knowledge distillation is used to transfer the knowledge of a robust teacher network into a lightweight network, so that our network runs quickly and at low computational cost, greatly increasing the practical value of gaze estimation. (3) Integrating these innovations, a novel fast differential neural network (Diff-Net), named FDAR-Net, is constructed; it achieves excellent results on MPIIGaze, UTMultiview and EyeDiap.
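The differential idea described above can be sketched as follows: rather than regressing gaze directly, the network predicts the *difference* between a query sample and a reference sample with known gaze, and the reference is chosen adaptively. This is a minimal illustrative sketch, not the paper's implementation; the nearest-feature selection rule, the function names, and the `predict_diff` stand-in are all assumptions, since the abstract does not specify these details.

```python
import numpy as np

def select_reference(query_feat, ref_feats):
    # Adaptively pick the reference sample whose appearance feature is
    # closest to the query (one plausible selection criterion; the paper's
    # actual rule is not given in the abstract).
    dists = np.linalg.norm(ref_feats - query_feat, axis=1)
    return int(np.argmin(dists))

def differential_gaze(query_feat, ref_feats, ref_gazes, predict_diff):
    # Differential estimation: final gaze = reference gaze + predicted
    # gaze difference between query and reference.
    i = select_reference(query_feat, ref_feats)
    return ref_gazes[i] + predict_diff(query_feat, ref_feats[i])
```

In a real system, `predict_diff` would be the learned differential network; here any callable taking the two feature vectors can be plugged in for experimentation.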