Abstract

Grouping video content into semantic segments is a crucial step toward content-based video summarization and retrieval. In this paper, we present a novel scene segmentation and semantic representation scheme for various video types. We first detect video shots using a coarse-to-fine algorithm. Key frames that carry no useful information are then detected and removed using template matching. Spatio-temporally coherent shots are grouped into the same scene based on the temporal constraint of the video content and the visual similarity of shot activity. Drawing on general editing techniques used in continuously recorded video, a semantic representation of scene content is specified to meet users' needs in video retrieval. The proposed algorithm has been evaluated on various types of video, including movies and TV programs. Promising experimental results show that the proposed method supports efficient retrieval of video content of interest.
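The shot-grouping step can be illustrated with a minimal sketch. It is not the paper's algorithm: it assumes each shot is summarized by a normalized color histogram of a key frame, uses histogram intersection as a stand-in for visual similarity, and models the temporal constraint as a fixed look-back window. The names `group_shots`, `window`, and `threshold` are hypothetical, and the default values are arbitrary.

```python
from typing import List, Sequence


def histogram_intersection(h1: Sequence[float], h2: Sequence[float]) -> float:
    """Similarity in [0, 1] between two normalized histograms."""
    return sum(min(a, b) for a, b in zip(h1, h2))


def group_shots(
    histograms: List[Sequence[float]],
    window: int = 3,         # assumed temporal constraint: look back at most 3 shots
    threshold: float = 0.7,  # assumed similarity cut-off
) -> List[int]:
    """Return one scene label per shot.

    A shot stays in the current scene if it is similar enough to at least
    one of the previous `window` shots of that scene; otherwise it opens
    a new scene.
    """
    labels: List[int] = []
    scene = 0
    scene_start = 0  # index of the first shot in the current scene
    for i, hist in enumerate(histograms):
        if i > 0:
            first = max(scene_start, i - window)
            similar = any(
                histogram_intersection(hist, histograms[j]) >= threshold
                for j in range(first, i)
            )
            if not similar:
                scene += 1
                scene_start = i
        labels.append(scene)
    return labels


# Toy input: two shots dominated by one color, then two dominated by another.
shots = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
print(group_shots(shots))
```

In this sketch a shot is only compared with earlier shots of the current scene, so interleaved patterns such as dialogue cuts would be split; a fuller method would need to handle those cases.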

