Exploring collaborative caption editing to augment video-based learning.

Bhavya Bhavya, Chengxiang Zhai, Wenting Li, Lawrence Angrave, Si Chen, Yun Huang, Zhilin Zhang

doi:10.1007/s11423-022-10137-5

Open Access

https://doi.org/10.1007/s11423-022-10137-5

Abstract

Captions play a major role in making educational videos accessible to all and are known to benefit a wide range of learners. However, many educational videos either lack captions or have inaccurate ones. Prior work has shown that crowdsourcing can produce accurate captions cost-efficiently, but little is known about how learners edit captions of educational videos, whether individually or collaboratively. In this work, we conducted a user study in which 58 learners (from a course of 387 learners) edited captions generated by Automatic Speech Recognition (ASR) technologies for 89 lecture videos. For each video, different learners conducted two rounds of editing. Based on the editing logs, we created a taxonomy of errors in educational video captions (e.g., Discipline-Specific, General, Equations). From the interviews, we identified individual and collaborative error-editing strategies. We then demonstrated the feasibility of applying machine learning models to assist learners in editing. Our work provides practical implications for advancing video-based learning and for educational video caption editing.
