Fast and Robust Text Detection in MOOCs videos

Huasong Zhong, Yuchun Ma

doi:10.1145/3231848.3231856

Abstract

Massive Open Online Courses (MOOCs) [1] have become increasingly popular, and people all over the world now study online. The text in MOOC videos carries the main content of the lectures and therefore plays a significant role in content-based video summarization, analysis, retrieval, and knowledge extraction. However, due to the characteristics of MOOC videos, conventional text detection methods and natural scene text detection algorithms are not well suited to text detection and recognition in these videos. In this paper, we present a novel text detection and recognition flow based on the features of MOOC videos. First, we propose a new candidate character region detector called Pruned-MSER, which prunes many non-character and overlapping regions. Then, a line-level clustering algorithm groups the candidate character regions into text lines. Finally, we use a Convolutional Neural Network (CNN) as the text line classifier. The text detection algorithm is compared with existing methods and evaluated on a benchmark of MOOC videos. Our algorithm shows robust detection performance, with an F-measure of over 95%.
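To make the line-level grouping step concrete, the following is a minimal sketch of one simple way to group candidate character boxes into text lines by vertical overlap. It is not the clustering algorithm of the paper, which the abstract does not specify. The box format `(x, y, width, height)`, the function names, and the `min_overlap` threshold are all assumptions made for illustration.

```python
from typing import List, Tuple

# Hypothetical box format: (x, y, width, height) in pixels, origin at top-left.
Box = Tuple[int, int, int, int]


def vertical_overlap_ratio(a: Box, b: Box) -> float:
    """Overlap of the vertical extents of two boxes, relative to the shorter box."""
    top = max(a[1], b[1])
    bottom = min(a[1] + a[3], b[1] + b[3])
    shorter = min(a[3], b[3])
    if shorter <= 0:
        return 0.0
    return max(0, bottom - top) / shorter


def group_into_lines(boxes: List[Box], min_overlap: float = 0.5) -> List[List[Box]]:
    """Greedily assign each box to the first line whose most recently added
    box overlaps it vertically by at least min_overlap (an assumed threshold)."""
    lines: List[List[Box]] = []
    for box in sorted(boxes, key=lambda b: (b[1], b[0])):
        for line in lines:
            if vertical_overlap_ratio(line[-1], box) >= min_overlap:
                line.append(box)
                break
        else:
            # No existing line matched, so this box starts a new line.
            lines.append([box])
    # Order the boxes within each line from left to right.
    return [sorted(line, key=lambda b: b[0]) for line in lines]


if __name__ == "__main__":
    candidates = [
        (10, 10, 8, 12), (20, 11, 8, 12), (30, 9, 8, 12),  # first text line
        (10, 40, 8, 12), (20, 41, 8, 12),                  # second text line
    ]
    for line in group_into_lines(candidates):
        print(line)
```

A greedy pass like this works best on slide-style frames where text lines are roughly horizontal. Skewed or curved text would likely need a different grouping criterion.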

Full Text