Abstract

High-frequency gaze tracking shows significant promise in critical applications such as foveated rendering, gaze-based identity verification, and the diagnosis of mental disorders. However, existing eye-tracking systems based on CCD/CMOS cameras either provide tracking frequencies below 200 Hz or rely on high-speed cameras, which incur substantial power consumption and increased device volume. Although several high-speed eye-tracking datasets and methods based on event cameras exist, they are primarily tailored to near-eye camera scenarios and therefore forgo the advantages of remote camera setups: no direct contact, greater user comfort and convenience, and free head movement in natural environments. In this work, we present RGBE-Gaze, the first large-scale, multimodal dataset for high-frequency remote gaze tracking, built by synchronizing RGB and event cameras. The dataset was collected from 66 participants of diverse genders and ages. A custom hybrid RGB-event camera setup was used to capture 3.6 million full-face, high-spatial-resolution RGB images and 26.3 billion high-temporal-resolution event samples. The dataset also includes 10.7 million gaze references from the Gazepoint GP3 HD eye tracker and 15,972 sparse point-of-gaze (PoG) ground-truth labels obtained from participants' manual clicks on visual stimuli. We further report the dataset's distribution across characteristics such as head pose and distance, gaze direction, pupil size, and tracking frequency. In addition, we introduce a hybrid frame-event gaze estimation method designed specifically for the collected dataset, and we extensively evaluate it alongside existing frame-based gaze estimation methods under various gaze-related factors, including different subjects, gaze directions, head poses, head distances, and pupil diameters. The results show that introducing the event stream as an additional modality increases gaze tracking frequency and yields more robust estimation across these diverse gaze-related factors.
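The paper's actual fusion architecture is described in the body of the work, but the core idea the abstract names, raising the gaze-update rate by exploiting events between RGB frames, can be sketched briefly. The following is a minimal illustrative sketch, not the authors' method: the sensor resolution, update rates, and the centroid-displacement update (FRAME_RATE_HZ, UPDATE_RATE_HZ, GAIN, accumulate_events) are all assumptions made for exposition.

```python
# Illustrative sketch only (NOT the RGBE-Gaze method): refine a per-frame
# gaze estimate between RGB frames using accumulated event activity.
import numpy as np

FRAME_RATE_HZ = 30     # assumed RGB frame rate
UPDATE_RATE_HZ = 1000  # assumed target gaze-update rate from events

def accumulate_events(events, t0, t1, shape=(260, 346)):
    """Sum signed event polarities falling in [t0, t1) into a 2D map.

    `events` is a structured array with fields 't' (seconds), 'x', 'y',
    and 'p' (polarity in {-1, +1}) -- a common event-camera layout.
    `shape` is an assumed sensor resolution.
    """
    mask = (events['t'] >= t0) & (events['t'] < t1)
    img = np.zeros(shape, dtype=np.float32)
    np.add.at(img, (events['y'][mask], events['x'][mask]),
              events['p'][mask].astype(np.float32))
    return img

def high_frequency_gaze(frame_gaze, events, frame_times):
    """Emit gaze estimates at UPDATE_RATE_HZ by shifting each per-frame
    estimate with the displacement of the event-activity centroid.

    `frame_gaze` holds one (x, y) estimate per RGB frame (from any
    frame-based method); events refine it between frames. GAIN maps
    sensor-pixel motion to gaze motion and is a made-up constant here.
    """
    GAIN = 1.0
    dt = 1.0 / UPDATE_RATE_HZ
    outputs = []
    for i in range(len(frame_times) - 1):
        gaze = np.asarray(frame_gaze[i], dtype=np.float64)
        prev_centroid = None
        t = frame_times[i]
        while t < frame_times[i + 1]:
            ev_map = accumulate_events(events, t, t + dt)
            weights = np.abs(ev_map)
            if weights.sum() > 0:
                ys, xs = np.nonzero(weights)
                w = weights[ys, xs]
                centroid = np.array([xs @ w, ys @ w]) / w.sum()
                if prev_centroid is not None:
                    # Eye motion shifts event activity; use its displacement
                    # as a proxy for gaze motion between RGB frames.
                    gaze += GAIN * (centroid - prev_centroid)
                prev_centroid = centroid
            outputs.append((t, gaze.copy()))
            t += dt
    return outputs
```

In a learned pipeline such as the one the paper evaluates, a fusion network would replace the centroid heuristic, but the structure is the same: frame-anchored estimates refined by inter-frame event maps, which is what permits updates well above the RGB frame rate.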
