Abstract

In this paper we propose an integrated framework for automatic bilingual subtitle generation for lecture videos, especially for MOOCs. The framework consists of Automatic Speech Recognition (ASR), Sentence Boundary Detection (SBD), and Machine Translation (MT). We then quantitatively compare three kinds of subtitles: auto-generated subtitles, subtitles produced manually from scratch, and auto-generated subtitles with manual modification, in terms of accuracy and time expenditure, in both the original and target languages. The results show that the auto-generated subtitles in the original language (English) are already fairly accurate. Using them as a draft, human subtitle producers can save 54% of the working time and simultaneously reduce the error rate by 54.3%, a significant improvement. However, the effectiveness of machine-translated subtitles (English to Chinese) is limited. Overall, applying the proposed framework shortens the total working time for preparing bilingual subtitles by approximately one third, with no decline in quality.
