Abstract

Gaze estimation is one of the most promising technologies for supporting indoor monitoring and interaction systems. However, previous gaze estimation techniques generally work only in controlled laboratory environments because they require a large number of high-resolution eye images. This makes them unsuitable for welfare and healthcare facilities, which present the following challenges: 1) users’ continuous movements, 2) various lighting conditions, and 3) a limited amount of available data. To address these issues, we introduce a multi-view multi-modal head-gaze estimation system that translates the user’s head orientation into the gaze direction. The proposed system captures the user with multiple cameras in depth and infrared modalities to train gaze estimators that are more robust under the aforementioned conditions. To this end, we implemented a deep learning pipeline that can handle different types and combinations of data. The proposed system was evaluated on data collected from 10 volunteer participants to analyze how the use of single/multiple cameras and modalities affects the performance of head-gaze estimators. Through various experiments, we found that 1) the infrared modality provides more useful features than the depth modality, 2) multi-view multi-modal approaches provide better accuracy than single-view single-modal approaches, and 3) the proposed estimators achieve inference efficiency high enough for real-time applications.
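The abstract does not specify the network architecture, so the following PyTorch sketch is only an illustration of what a pipeline handling multiple camera views and modalities might look like: one small encoder per (view, modality) stream, with features concatenated and regressed to a head-gaze direction. All stream counts, shapes, and layer sizes are illustrative assumptions, not the authors' design.

# Hedged sketch, NOT the paper's actual model: per-stream CNN encoders
# (one per camera/modality pair) fused by concatenation into a gaze regressor.
import torch
import torch.nn as nn


class StreamEncoder(nn.Module):
    """Encodes one single-channel image stream (e.g., depth or infrared)."""

    def __init__(self, feat_dim: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(1, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(32, feat_dim), nn.ReLU(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)


class MultiViewMultiModalGaze(nn.Module):
    """Fuses features from all (view, modality) streams into a gaze vector."""

    def __init__(self, n_streams: int, feat_dim: int = 64):
        super().__init__()
        self.encoders = nn.ModuleList(
            StreamEncoder(feat_dim) for _ in range(n_streams)
        )
        self.head = nn.Sequential(
            nn.Linear(n_streams * feat_dim, 128), nn.ReLU(),
            nn.Linear(128, 2),  # (yaw, pitch) of the head-gaze direction
        )

    def forward(self, streams: list[torch.Tensor]) -> torch.Tensor:
        feats = [enc(x) for enc, x in zip(self.encoders, streams)]
        return self.head(torch.cat(feats, dim=1))


# Example: 2 cameras x 2 modalities (depth, infrared) = 4 input streams.
model = MultiViewMultiModalGaze(n_streams=4)
batch = [torch.randn(8, 1, 64, 64) for _ in range(4)]  # dummy 64x64 frames
print(model(batch).shape)  # torch.Size([8, 2])

Concatenation fusion is the simplest choice that still lets single-view or single-modal variants be trained by changing n_streams, which matches the kind of ablation comparison described in the abstract.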

Highlights

  • We propose a head-gaze estimation system based on a multi-view multi-modal approach for monitoring and interacting with a user in an indoor environment

  • The proposed approach captures the user from multiple views using cameras with depth and infrared modalities to estimate head-gaze under the following challenging conditions: 1) indoor interaction must be supported, 2) various lighting conditions must be handled, and 3) only a limited amount of data is available


Introduction

Since the outbreak of COVID-19, governmental bodies worldwide have announced that noncontact interaction between people should be established to limit the spread of the virus [1]. Noncontact interaction is essential for medically vulnerable people (e.g., the elderly, patients, and the disabled, as shown in Fig. 1), who are susceptible to infection [2]. Medical welfare facilities (e.g., hospitals and nursing homes) have begun to limit face-to-face care and contact visits [3]. Although noncontact interaction is effective in protecting the medically vulnerable from the virus, it makes continuous monitoring of their statuses and conditions challenging. It is therefore essential to develop more effective and advanced noncontact interaction techniques to monitor and predict the health status, intentions, and behaviors of people in the welfare domain.
