Abstract
This paper presents a robust approach to extracting and summarizing the textual content of instructional videos for handwriting recognition, indexing and retrieval, and other e-learning applications. Content extraction from instructional videos is challenging due to image noise, changes in lighting conditions, camera movements, and unavoidable occlusions by instructors. In this paper, we develop a probabilistic model to accurately detect board regions and an adaptive thresholding technique to extract the written chalk pixels on blackboards. We further compute instructional video key frames by analyzing the fluctuation in the number of chalk pixels. By matching the textual content of video frames using a Hausdorff-distance-based technique, we reduce the content redundancy among the key frames. Performance evaluation on three full-length instructional videos shows that our algorithm is highly effective in summarizing instructional video content and achieves very low content missing rates.
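To illustrate the redundancy-reduction step, the following minimal sketch compares the chalk pixels of two key frames with the classical symmetric Hausdorff distance. The abstract does not specify which Hausdorff variant the method uses, so this is an assumption; the function names, the representation of a frame as a list of (x, y) chalk-pixel coordinates, and the `threshold` value are all hypothetical and chosen only for illustration.

```python
import math


def directed_hausdorff(a, b):
    """Largest distance from any point in a to its nearest point in b."""
    return max(min(math.dist(p, q) for q in b) for p in a)


def hausdorff(a, b):
    """Classical symmetric Hausdorff distance between two non-empty point sets."""
    return max(directed_hausdorff(a, b), directed_hausdorff(b, a))


def is_redundant(frame_a, frame_b, threshold=2.0):
    """Treat two key frames as duplicates if their chalk pixels nearly coincide.

    frame_a, frame_b: lists of (x, y) chalk-pixel coordinates (assumed format).
    threshold: hypothetical tolerance in pixels, not a value from the paper.
    """
    if not frame_a or not frame_b:
        # An empty board carries no content to match against.
        return False
    return hausdorff(frame_a, frame_b) <= threshold
```

Under these assumptions, a key frame would be dropped from the summary when `is_redundant` reports that its chalk content matches a key frame already kept. The brute-force comparison is quadratic in the number of chalk pixels, so a practical system would likely subsample the pixels or use a distance transform.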