Abstract

E-learning has emerged as an indispensable educational mode in the post-epidemic era. However, without appropriate activity monitoring, students find it difficult to stay engaged in this mode. Our work explores a promising solution that combines gaze and mouse data to recognize students' activities, thereby facilitating activity monitoring and analysis during e-learning. We first surveyed 200 students from a local university and found greater acceptance of eye trackers and mouse loggers than of video surveillance. We then designed eight routine digital activities for students, collected a multimodal dataset, and analyzed the patterns and correlations between gaze and mouse behavior across these activities. Our proposed Joint Cross-Attention Fusion Net, a multimodal activity recognition framework, leverages the gaze-mouse relationship to improve classification performance by fusing cross-modal representations through a cross-attention mechanism and incorporating joint features that characterize gaze-mouse coordination. Evaluation results show that our method achieves an F1-score of up to 94.87% in classifying the eight activities, an improvement of at least 7.44% over using gaze or mouse data alone. This research opens new possibilities for monitoring student engagement in intelligent education systems and suggests a promising strategy for combining perception and action modalities in behavioral analysis across a range of ubiquitous computing environments.
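
To make the fusion idea concrete, the following is a minimal sketch of bidirectional cross-attention between a gaze sequence and a mouse sequence, followed by concatenation with a joint-feature vector. It is not the Joint Cross-Attention Fusion Net itself: the function names (`cross_attention`, `fuse`), the feature dimension, the sequence lengths, the mean pooling, and the omission of learned projection matrices are all assumptions made for illustration, and the inputs are random placeholders rather than real data.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(query_seq, context_seq):
    """Scaled dot-product attention in which query_seq (Tq, d)
    attends to context_seq (Tc, d). Learned query/key/value
    projections are omitted here for brevity (an assumption)."""
    d = query_seq.shape[-1]
    scores = query_seq @ context_seq.T / np.sqrt(d)  # shape (Tq, Tc)
    weights = softmax(scores, axis=-1)               # each row sums to 1
    return weights @ context_seq, weights            # attended: (Tq, d)

def fuse(gaze, mouse, joint):
    """Hypothetical fusion: attend in both directions, mean-pool each
    attended sequence over time, then append the joint features."""
    gaze_to_mouse, _ = cross_attention(gaze, mouse)
    mouse_to_gaze, _ = cross_attention(mouse, gaze)
    return np.concatenate([
        gaze_to_mouse.mean(axis=0),
        mouse_to_gaze.mean(axis=0),
        joint,
    ])

# Placeholder inputs: 50 gaze samples and 40 mouse samples, each already
# embedded into 16 dimensions, plus 4 gaze-mouse coordination features.
rng = np.random.default_rng(0)
gaze = rng.normal(size=(50, 16))
mouse = rng.normal(size=(40, 16))
joint = rng.normal(size=(4,))

fused = fuse(gaze, mouse, joint)
_, attn = cross_attention(gaze, mouse)
print(fused.shape, attn.shape)
```

In a full model, the fused vector would be passed to a classifier over the eight activity classes; that stage, and any training procedure, is left out of this sketch.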
