Abstract

The study of gaze behavior has primarily been constrained to controlled environments in which the head is fixed. Consequently, little effort has been invested in developing algorithms that categorize gaze events (e.g., fixations, pursuits, saccades, gaze shifts) while the head is free and thus contributes to the velocity signals upon which classification algorithms typically operate. Our approach was to collect a novel, naturalistic, and multimodal dataset of eye + head movements as subjects performed everyday tasks while wearing a mobile eye tracker equipped with an inertial measurement unit and a 3D stereo camera. This Gaze-in-the-Wild dataset (GW) includes eye + head rotational velocities (deg/s), infrared eye images, and scene imagery (RGB + D). A portion was labelled by coders into gaze motion events with a mutual agreement of 0.74 sample-based Cohen's κ. These labelled data were used to train and evaluate two machine learning algorithms, a Random Forest and a Recurrent Neural Network model, for gaze event classification. Assessment involved the application of established and novel event-based performance metrics. The classifiers achieve ~87% of human performance in detecting fixations and saccades but fall short (50%) on detecting pursuit movements. Moreover, pursuit classification is far worse in the absence of head movement information. A subsequent analysis of feature significance in our best-performing model revealed that classification can be done using only the magnitudes of eye and head movements, potentially removing the need for calibration between the head and eye tracking systems. The GW dataset, trained classifiers, and evaluation metrics will be made publicly available with the intention of facilitating growth in the emerging area of head-free gaze event classification.
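For concreteness, the sketch below shows one way the sample-based Cohen's κ reported above could be computed between two coders' per-sample event labels. The function and the integer event encoding are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch: sample-based Cohen's kappa between two coders' event labels.
# The integer encoding (0=fixation, 1=pursuit, 2=saccade) is an illustrative
# assumption, not the paper's actual coding scheme.
import numpy as np

def cohens_kappa(labels_a: np.ndarray, labels_b: np.ndarray) -> float:
    """Chance-corrected agreement between two equal-length label sequences
    (one label per eye-tracker sample)."""
    assert labels_a.shape == labels_b.shape
    classes = np.union1d(labels_a, labels_b)
    # Observed agreement: fraction of samples where the coders agree.
    p_o = np.mean(labels_a == labels_b)
    # Expected agreement under independent labelling with each coder's
    # marginal class frequencies.
    p_e = sum(np.mean(labels_a == c) * np.mean(labels_b == c) for c in classes)
    return (p_o - p_e) / (1.0 - p_e)

# Example: two coders labelling six samples.
coder1 = np.array([0, 0, 2, 2, 1, 1])
coder2 = np.array([0, 0, 2, 1, 1, 1])
print(cohens_kappa(coder1, coder2))  # agreement above chance, below 1.0
```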

Highlights

  • The study of gaze behavior has primarily been constrained to controlled environments in which the head is fixed

  • Gaze-in-the-Wild dataset (GW) was collected from 19 participants engaged in everyday activities using spatially and temporally calibrated equipment comprising a hardhat with an inertial measurement unit (IMU), eye tracking glasses, and a stereo-based RGB-D (RGB imagery plus depth) sensor

  • The main purpose of this work was to build the first dataset of labelled gaze movements collected during natural behavior ‘in the wild’, to have multiple labellers manually label the gaze events in the dataset, and to showcase the performance of two standard temporal classification techniques, Random Forests and Recurrent Neural Networks, using common evaluation metrics (a minimal sketch of this classification setup follows below)
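
As a rough illustration of the sample-level classification setup named in the last highlight, the sketch below trains a scikit-learn Random Forest on per-sample eye and head rotational speeds. The two-feature design and the synthetic data are assumptions for illustration only, not the GW feature pipeline.

```python
# Illustrative sketch only: per-sample gaze-event classification with a
# Random Forest. The two features (eye and head rotational speed, deg/s)
# and the synthetic data are assumptions, not the GW feature engineering.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report

rng = np.random.default_rng(0)
n = 3000
eye_speed = rng.gamma(shape=2.0, scale=20.0, size=n)   # deg/s
head_speed = rng.gamma(shape=2.0, scale=10.0, size=n)  # deg/s
X = np.column_stack([eye_speed, head_speed])

# Toy labels: 0=fixation (slow eye), 2=saccade (fast eye), 1=pursuit (else).
y = np.where(eye_speed < 15, 0, np.where(eye_speed > 80, 2, 1))

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)
clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_tr, y_tr)
print(classification_report(y_te, clf.predict(X_te)))
print(clf.feature_importances_)  # relative weight of eye vs. head speed
```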


Introduction

Because the study of gaze behavior has primarily been constrained to head-fixed laboratory environments, classification of gaze events under free head movement remains largely unexplored; the Gaze-in-the-Wild (GW) dataset described above was collected to address this gap. Larsson et al. used a head-mounted IMU in a study in which subjects performed visual tracking tasks while watching pre-rendered stimuli projected onto a 2D screen [16]. They established that compensating for head movements reduces the standard deviation of the eye position signal. Solutions that fuse IMU pose estimates with head tracking based on egocentric video are promising, but they have not yet been adopted in the context of eye tracking, and doing so is beyond the scope of the current work.
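
To make the head-compensation idea concrete, the sketch below adds IMU head velocity to eye-in-head velocity to recover a gaze-in-world signal. It is a minimal sketch assuming both signals are horizontal rotational velocities (deg/s) in a shared, temporally aligned frame; the simulated signals and variable names are illustrative, not Larsson et al.'s method.

```python
# Minimal sketch of head-movement compensation. Assumes eye and head
# velocities share a coordinate frame and timebase; all data are simulated.
import numpy as np

rng = np.random.default_rng(1)
t = np.arange(0, 2, 1 / 200)                    # 2 s at 200 Hz
head_vel = 30 * np.sin(2 * np.pi * 0.5 * t)     # IMU: oscillating head

# During fixation on a world-fixed target, the vestibulo-ocular reflex
# counter-rotates the eye, so eye-in-head velocity mirrors head velocity
# (plus measurement noise).
eye_in_head_vel = -head_vel + rng.normal(0, 2, t.size)

gaze_in_world_vel = eye_in_head_vel + head_vel  # compensated signal

print(np.std(eye_in_head_vel))    # large: dominated by head motion
print(np.std(gaze_in_world_vel))  # small: residual noise only
```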

