Abstract

This paper presents a robust approach to extracting and summarizing the textual content of instructional videos for handwriting recognition, indexing and retrieval, and other e-learning applications. Content extraction from instructional videos is challenging because of image noise, changes in lighting conditions, camera movement, and unavoidable occlusion by instructors. We develop a probabilistic model to accurately detect board regions and an adaptive thresholding technique to extract the written chalk pixels on blackboards. We then compute key frames of the instructional video by analyzing fluctuations in the number of chalk pixels. By matching the textual content of video frames with a Hausdorff-distance-based technique, we reduce content redundancy among the key frames. A performance evaluation on three full-length instructional videos shows that our algorithm is highly effective at summarizing instructional video content and achieves very low content-missing rates.
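
The last two steps of the pipeline, key-frame selection and redundancy reduction, can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes chalk pixels have already been extracted as non-empty lists of (x, y) tuples, it treats local maxima of the chalk-pixel count as key-frame candidates (one plausible reading of "analyzing fluctuations"), and it uses the plain symmetric Hausdorff distance. The function names and the `threshold` parameter are hypothetical.

```python
from math import hypot


def directed_hausdorff(a, b):
    """Largest distance from any point in a to its nearest point in b.

    Assumes both point sets are non-empty.
    """
    return max(min(hypot(ax - bx, ay - by) for bx, by in b) for ax, ay in a)


def hausdorff(a, b):
    """Symmetric Hausdorff distance between two non-empty point sets."""
    return max(directed_hausdorff(a, b), directed_hausdorff(b, a))


def local_peaks(counts):
    """Indices where the per-frame chalk-pixel count is a local maximum.

    Assumption: a peak marks the board at its fullest, just before erasure.
    """
    last = len(counts) - 1
    return [
        i
        for i in range(len(counts))
        if (i == 0 or counts[i] >= counts[i - 1])
        and (i == last or counts[i] > counts[i + 1])
    ]


def drop_redundant(frames, threshold):
    """Keep a frame only if it is farther than `threshold` from every kept frame."""
    kept = []
    for points in frames:
        if all(hausdorff(points, k) > threshold for k in kept):
            kept.append(points)
    return kept


# Hypothetical chalk-pixel counts for eight consecutive frames.
counts = [0, 5, 9, 2, 4, 7, 7, 3]
peaks = local_peaks(counts)

# Hypothetical chalk-pixel sets for three candidate key frames.
frame_a = [(0, 0), (1, 0)]
frame_b = [(0, 1), (1, 1)]          # frame_a shifted by one pixel
frame_c = [(0, 0), (1, 0), (4, 0)]  # frame_a plus new content
summary = drop_redundant([frame_a, frame_b, frame_c], threshold=1.5)
```

With these toy inputs, `frame_b` lies within the threshold of `frame_a` and is dropped as redundant, while `frame_c` is kept because its extra point is far from everything in `frame_a`. The brute-force distance is quadratic in the number of points, so a practical version would likely need a faster nearest-neighbor search.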
