An Analysis of the Errors in the Auto-Generated Captions of University Commencement Speeches on YouTube

Jeong-Hwa Lee, Kyung-Whan Cha

doi:10.18823/asiatefl.2020.17.1.9.143

Abstract

Auto-generated captions on YouTube have proven useful in helping viewers better understand the words being spoken. However, the captions are sometimes inaccurate, and in these cases they lead to confusion. The aim of this paper is to identify and analyze errors in the auto-generated captions of 20 commencement speeches on YouTube. These speeches were presented over a period of 12 years by speakers from different walks of life. The researchers selected ten male and ten female icons. Only the first 10 minutes of each speech were used for this investigation. All the caption errors were collected and analyzed. The analysis showed that the frequency of errors in each speech ranged between 10 and 46 cases, with an average of one error occurring about every 26 seconds. Among the different error categories, nouns recorded the highest number with 144 cases (31.3%), followed by verbs with 93 cases (20.2%) and prepositions with 37 cases (8.1%). Among the four subcategories, namely omission, addition, substitution, and word order, substitution recorded the highest number of errors with 357 cases (77.6%). Furthermore, the errors were classified into two major groups. The first, involving function words, appeared in 169 cases (36.7%). The second, involving content words, appeared in 291 cases (63.3%). The results of this research suggest that continuous development of the voice recognition software that automatically generates captions is necessary to produce more accurate captions that will help viewers and listeners better comprehend the video content.
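
As a rough consistency check, some of the figures reported above can be recomputed from one another. The sketch below is illustrative only: the variable names are hypothetical, and it assumes that the function-word and content-word groups together account for every error and that each of the 20 speeches contributed exactly 10 minutes of audio.

```python
# Figures are taken from the abstract; names are illustrative, not from the paper.
SPEECHES = 20
SECONDS_PER_SPEECH = 10 * 60  # only the first 10 minutes of each speech were analyzed

function_word_errors = 169
content_word_errors = 291

# Assumption: the two groups are exhaustive, so their sum is the total error count.
total_errors = function_word_errors + content_word_errors


def share(count, total=total_errors):
    """Return count as a percentage of total, rounded to one decimal place."""
    return round(100 * count / total, 1)


# Average interval between errors across all analyzed audio.
seconds_per_error = SPEECHES * SECONDS_PER_SPEECH / total_errors

print("total errors:", total_errors)
print("seconds per error:", round(seconds_per_error, 1))
print("function words %:", share(function_word_errors))
print("content words %:", share(content_word_errors))
print("substitution %:", share(357))
```

Under these assumptions the total comes to 460 errors over 12,000 seconds of audio, which agrees with the reported average of about one error every 26 seconds.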

Journal: The Journal of AsiaTEFL
Publication Date: Mar 31, 2020
Citations: 3