Abstract

To address the loss of gaze estimation accuracy caused by individual differences across environments, this study proposes a novel gaze estimation algorithm based on attention mechanisms. First, a facial feature extractor (FFE) is constructed to obtain facial feature information and locate the feature regions of the left and right eyes. Then, L2CSNet (L2 loss + cross-entropy loss + softmax layer network) is combined with a pyramid squeeze attention (PSA) module, which increases the weights of features relevant to gaze estimation within these regions, suppresses irrelevant weights, and extracts finer-grained feature information to obtain gaze direction features. Finally, integrating FFE and PSA with L2CSNet yields the proposed FPSA_L2CSNet, which is evaluated on four representative publicly available datasets and a real-world dataset covering different backgrounds, lighting conditions, nationalities, skin tones, ages, genders, and partial occlusions. Experimental results indicate that the proposed model improves gaze estimation accuracy by 13.88%, 11.43%, and 7.34% over L2CSNet, FSE_L2CSNet, and FCBA_L2CSNet, respectively. The model not only improves the robustness of gaze estimation but also provides more accurate results than the original model.
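For readers unfamiliar with the loss named in the abstract, below is a minimal sketch (not the authors' code) of how an L2 + cross-entropy + softmax combination is typically realized for a gaze angle: the network emits per-bin scores, cross-entropy supervises the bin classification, and a softmax expectation over bin centers recovers a continuous angle penalized with an L2 (MSE) term. The bin count, angle range, and weighting factor `alpha` are illustrative assumptions, not values from the paper.

```python
# Hedged sketch of the L2 + cross-entropy + softmax loss behind the name
# "L2CSNet". Bin layout and alpha are assumptions for illustration only.
import torch
import torch.nn.functional as F

NUM_BINS = 90     # assumed: 90 bins of 4 degrees covering [-180, 180)
BIN_WIDTH = 4.0
bin_centers = (torch.arange(NUM_BINS, dtype=torch.float32) * BIN_WIDTH
               - 180.0 + BIN_WIDTH / 2)

def l2cs_loss(logits, target_deg, alpha=1.0):
    """Combined loss for one gaze angle (yaw or pitch).

    logits:     (batch, NUM_BINS) raw scores from the network head
    target_deg: (batch,) ground-truth angle in degrees
    """
    # Classification term: cross-entropy on the bin the true angle falls in.
    target_bin = ((target_deg + 180.0) / BIN_WIDTH).long().clamp(0, NUM_BINS - 1)
    ce = F.cross_entropy(logits, target_bin)

    # Regression term: a softmax expectation over bin centers turns the
    # scores into a continuous angle, penalized with an L2 (MSE) loss.
    probs = F.softmax(logits, dim=1)
    pred_deg = (probs * bin_centers).sum(dim=1)
    mse = F.mse_loss(pred_deg, target_deg)

    return ce + alpha * mse

# Usage: one loss per angle; yaw and pitch would have separate heads.
logits_yaw = torch.randn(8, NUM_BINS, requires_grad=True)
gt_yaw = torch.empty(8).uniform_(-90.0, 90.0)
loss = l2cs_loss(logits_yaw, gt_yaw)
loss.backward()
```

Binning plus a softmax expectation keeps the stable gradients of classification while still producing a continuous angle, which is why this hybrid is a common choice for gaze and head-pose regression.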
