Impact of Deep Learning on Localizing and Recognizing Handwritten Text in Lecture Videos

Lakshmi Haritha Medida, Kasarapu Ramani

doi:10.14569/ijacsa.2021.0120442

Open Access

https://doi.org/10.14569/ijacsa.2021.0120442

Abstract

Video recording technology has become increasingly capable and easy to use. As a result, many universities record and publish their lectures online to make them accessible to students. These lecture videos contain handwritten text written on paper, on a blackboard, or on a tablet with a stylus. However, recording lectures in this way produces large volumes of multimedia data very quickly, so handwritten text recognition on lecture video portals has become an important and demanding task. This paper develops a novel handwritten text detection and recognition approach for a lecture video dataset in four major phases: (a) text localization, (b) segmentation, (c) pre-processing and (d) recognition. In the first phase, arbitrarily oriented text in the video frames is localized using the Modified Region Growing (MRG) algorithm. The localized text regions are then segmented into words using K-means clustering. The segmented words are subsequently pre-processed to reduce blur artifacts. Finally, the pre-processed words are recognized using a Deep Convolutional Neural Network (DCNN). The proposed model is evaluated in terms of accuracy, precision, sensitivity and specificity to demonstrate its effectiveness for text detection and recognition in lecture videos. Experimental results show that, at a learning percentage of 70, the presented work achieves its highest accuracy of 89.3% for 500 frames.
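
The four performance measures named in the abstract have standard definitions in terms of confusion-matrix counts. The following is a minimal sketch of those definitions, not the authors' evaluation code; the function name `classification_metrics` and the example counts are hypothetical and are not results from the paper.

```python
def classification_metrics(tp, tn, fp, fn):
    """Compute accuracy, precision, sensitivity and specificity
    from true/false positive/negative counts."""

    def ratio(numerator, denominator):
        # Guard against empty classes so the sketch never divides by zero.
        return numerator / denominator if denominator else 0.0

    return {
        "accuracy": ratio(tp + tn, tp + tn + fp + fn),
        "precision": ratio(tp, tp + fp),
        "sensitivity": ratio(tp, tp + fn),   # also called recall
        "specificity": ratio(tn, tn + fp),
    }


# Hypothetical counts, for illustration only (not taken from the paper).
metrics = classification_metrics(tp=40, tn=45, fp=5, fn=10)
```

With these made-up counts, accuracy is 85/100, sensitivity is 40/50 and specificity is 45/50; the values serve only to show how the four measures differ from one another.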

Highlights

Professional lecture videos are abundant on the web, and their number is constantly growing
Although Optical Character Recognition (OCR) is often considered a solved problem, Handwritten Text Recognition, a crucial component of OCR, remains a challenging problem
This paper presents a novel text detection and recognition approach for a lecture video dataset in four major phases: (a) text localization, (b) segmentation, (c) pre-processing and (d) recognition
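
To give a feel for the text localization phase, the sketch below implements plain, textbook region growing on a tiny grayscale grid. It is an assumption-laden illustration: it is not the paper's Modified Region Growing algorithm, whose modifications are not described here, and the names `region_grow` and `bounding_box`, the seed position, the tolerance and the toy frame are all hypothetical.

```python
from collections import deque


def region_grow(image, seed, tolerance):
    """Grow a 4-connected region from `seed`, adding every reachable pixel
    whose intensity differs from the seed intensity by at most `tolerance`."""
    rows, cols = len(image), len(image[0])
    seed_value = image[seed[0]][seed[1]]
    region = {seed}
    queue = deque([seed])
    while queue:
        r, c = queue.popleft()
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and (nr, nc) not in region:
                if abs(image[nr][nc] - seed_value) <= tolerance:
                    region.add((nr, nc))
                    queue.append((nr, nc))
    return region


def bounding_box(region):
    """Return (min_row, min_col, max_row, max_col) of a set of pixels."""
    row_indices = [r for r, _ in region]
    col_indices = [c for _, c in region]
    return min(row_indices), min(col_indices), max(row_indices), max(col_indices)


# Toy frame: bright background (200) with dark "ink" pixels (30 to 40).
frame = [
    [200, 200, 200, 200, 200],
    [200,  30,  35, 200, 200],
    [200,  32, 200, 200,  40],
    [200, 200, 200, 200, 200],
]
stroke = region_grow(frame, seed=(1, 1), tolerance=20)
box = bounding_box(stroke)
```

The dark pixel at row 2, column 4 is not connected to the seed, so it stays outside the grown region. A real localizer would need a seed-selection strategy and handling for arbitrarily oriented text, which this sketch does not attempt.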

Summary

Introduction

Professional lecture videos are abundant on the web, and their number is constantly growing. These lecture videos motivate students towards teleteaching and e-learning [1] [2] [3] [4]. A better understanding of a lecture video depends on vital cues such as figures, images and text [8] [9] [10] [11]. Among these vital cues, the text is available in

Methods

Results

Conclusion