Abstract

This paper presents a robust approach to extracting content from instructional videos for handwriting recognition, indexing and retrieval, and other e-learning applications. For instructional videos of chalkboard presentations, retrieving the handwritten content on the board (e.g., characters, drawings, figures) is the prerequisite first step towards further exploration of instructional video content. However, content extraction from instructional videos remains challenging because of video noise, non-uniform color in board regions, changes in lighting conditions during a video session, camera movements, and unavoidable occlusions by instructors. To solve this problem, we first segment video frames into multiple regions and estimate the parameters of the board regions based on statistical analysis of the pixels in dominant regions. We then separate the board regions from irrelevant regions using a probabilistic classifier. Finally, we combine top-hat morphological processing with a gradient-based adaptive thresholding technique to retrieve content pixels from the board regions. Evaluation of the content extraction results on four full-length instructional videos demonstrates the high performance of the proposed method. Extracting the content text facilitates research on the full exploitation of instructional videos, such as content enhancement, indexing, and retrieval.
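
The final step of the pipeline, top-hat processing followed by thresholding, can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a grayscale frame stored as a list of rows, light chalk on a darker board (hence a white top-hat), and a square structuring element. The function names, the window radius `k`, and the fixed `threshold` are all hypothetical, and the fixed threshold is a simple stand-in for the gradient-based adaptive thresholding that the abstract names but does not specify.

```python
def _window(img, r, c, k):
    """Yield the pixels in the (2k+1)x(2k+1) window centred at (r, c), clipped at the borders."""
    h, w = len(img), len(img[0])
    for i in range(max(0, r - k), min(h, r + k + 1)):
        for j in range(max(0, c - k), min(w, c + k + 1)):
            yield img[i][j]


def erode(img, k=1):
    """Grayscale erosion: each pixel becomes the minimum of its window."""
    return [[min(_window(img, r, c, k)) for c in range(len(img[0]))]
            for r in range(len(img))]


def dilate(img, k=1):
    """Grayscale dilation: each pixel becomes the maximum of its window."""
    return [[max(_window(img, r, c, k)) for c in range(len(img[0]))]
            for r in range(len(img))]


def white_top_hat(img, k=1):
    """Image minus its morphological opening (erosion, then dilation).

    The opening removes bright structures narrower than the window, so the
    difference keeps thin bright strokes and suppresses the board background.
    """
    opened = dilate(erode(img, k), k)
    return [[p - o for p, o in zip(row, opened_row)]
            for row, opened_row in zip(img, opened)]


def extract_strokes(img, k=1, threshold=50):
    """Binary mask of candidate content pixels.

    Assumption: a fixed threshold replaces the paper's gradient-based
    adaptive thresholding, whose details the abstract does not give.
    """
    response = white_top_hat(img, k)
    return [[1 if v > threshold else 0 for v in row] for row in response]


# Toy frame: a uniform board (intensity 40) with a one-pixel-wide chalk stroke (200).
frame = [[40, 40, 200, 40, 40] for _ in range(5)]
mask = extract_strokes(frame)
```

In this toy frame the stroke is narrower than the 3x3 window, so the opening flattens it to the board level and the top-hat response is non-zero only along the stroke. On real frames the window radius would need to exceed the stroke width, and a fixed threshold would likely be too brittle under the lighting changes the abstract describes.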

