Abstract

Fast and accurate human intention prediction can significantly advance the performance of assistive devices for patients with limited motor or communication abilities. Among available modalities, eye movement can be valuable for inferring the user's intention, as it can be tracked non-invasively. However, existing limited studies in this domain do not provide the level of accuracy required for the reliable operation of assistive systems. By taking a data-driven approach, this paper presents a new framework that utilizes the spatial and temporal patterns of eye movement along with deep learning to predict the user's intention. In the proposed framework, the spatial patterns of gaze are identified by clustering the gaze points based on their density over displayed images in order to find the regions of interest (ROIs). The temporal patterns of gaze are identified via hidden Markov models (HMMs) to find the transition sequence between ROIs. Transfer learning is utilized to identify the objects of interest in the displayed images. Finally, models are developed to predict the user's intention after completing the task as well as at early stages of the task. The proposed framework is evaluated in an experiment involving predicting intended daily-life activities. Results indicate that an average classification accuracy of 97.42% is achieved, which is considerably higher than existing gaze-based intention prediction studies.
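The density-based grouping of gaze points into ROIs described above can be sketched with DBSCAN, a standard density-based clustering algorithm; the abstract does not name the exact algorithm used, and the synthetic gaze data, `eps`, and `min_samples` values below are illustrative assumptions:

```python
import numpy as np
from sklearn.cluster import DBSCAN

# Synthetic gaze data: two dense fixation clusters plus stray saccade samples.
rng = np.random.default_rng(0)
gaze = np.vstack([
    rng.normal(loc=(100, 100), scale=5, size=(50, 2)),  # fixation on object A
    rng.normal(loc=(400, 300), scale=5, size=(50, 2)),  # fixation on object B
    rng.uniform(0, 500, size=(5, 2)),                   # scattered noise points
])

# eps: maximum pixel distance between neighbouring gaze points within one ROI;
# min_samples: minimum local density required to form a cluster (assumed values).
labels = DBSCAN(eps=20, min_samples=10).fit(gaze).labels_

# Labels >= 0 index ROIs; -1 marks low-density noise (saccades, stray samples).
n_rois = len(set(labels) - {-1})
print(n_rois)  # 2 ROIs recovered from the two fixation clusters
```

Because DBSCAN labels low-density points as noise rather than forcing them into a cluster, brief saccades between objects do not inflate the ROI count.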

Highlights

  • When humans interact with their surroundings, their behavior offers cues regarding their intention

  • In the current work, we present a new framework capable of both intention prediction and “early” intention prediction

  • We provide an extensive evaluation of the proposed framework considering various classifiers and a larger number of subjects

  • We utilize hidden Markov models (HMMs) to analyze subjects’ visual behavior and determine the commonly used sequences of object selection for each task, enabling early intention prediction

  • We present an analysis of subjects’ gaze patterns to investigate individual preferences and differences

  • For other intended tasks involving four objects, the clustering algorithm detected four regions of interest (ROIs) in more than 60% of the corresponding trials. These results further confirm our assumption that when the number of ROIs is larger than two, the corresponding trial can be associated with an intended task rather than the unintended scenario, and that the number of detected ROIs can be used to successfully differentiate unintended and intended tasks
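The ROI-transition modeling behind early intention prediction can be illustrated with a simplified, fully observed first-order Markov sketch: per-task transition matrices are estimated from ROI visit sequences, and a partial sequence is scored against each task's model. The paper uses HMMs with hidden states; the task names, training sequences, and add-alpha smoothing below are hypothetical:

```python
import numpy as np

def transition_matrix(sequences, n_states, alpha=1.0):
    """Estimate a first-order ROI transition matrix from index sequences,
    with add-alpha smoothing so unseen transitions keep nonzero probability."""
    counts = np.full((n_states, n_states), alpha)
    for seq in sequences:
        for a, b in zip(seq, seq[1:]):
            counts[a, b] += 1
    return counts / counts.sum(axis=1, keepdims=True)

def log_likelihood(seq, T):
    """Log-probability of a (possibly partial) ROI sequence under one model."""
    return sum(np.log(T[a, b]) for a, b in zip(seq, seq[1:]))

# Hypothetical training data: ROI visit orders recorded for two tasks.
task_models = {
    "drink": transition_matrix([[0, 1, 2], [0, 1, 1, 2]], n_states=3),
    "write": transition_matrix([[2, 1, 0], [2, 2, 1, 0]], n_states=3),
}

# Early prediction: classify from only the first few fixations of a trial.
partial = [0, 1]
pred = max(task_models, key=lambda t: log_likelihood(partial, task_models[t]))
print(pred)  # "drink": the 0 -> 1 transition dominates its training sequences
```

Scoring a growing prefix of the ROI sequence against each task model is what allows a prediction before the task is completed; a full HMM additionally marginalizes over hidden states when the ROI labels themselves are uncertain.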


Introduction

When humans interact with their surroundings, their behavior offers cues regarding their intention. These cues can be observed through gestures, speech, or movements of their eyes. In human-machine interfaces (HMIs), predicting human intention advances machines’ intelligence. In HMIs, various modalities are being utilized to predict the user’s intention. Hand position and orientation have been utilized in [3] to predict various activities such as pouring a bottle or drinking water. Electroencephalography (EEG) signals have been shown to be valuable for predicting movement and gait intentions [4]–[6] and emotions [7].

Results
Discussion
Conclusion