Abstract

Keyframe extraction for shot representation is the most common approach to video summarization. A reliable keyframe extraction algorithm should determine the number of keyframes automatically and extract non-repetitive keyframes that efficiently summarize the video content. It is also important that keyframe extraction runs in a reasonable time. The proposed method is based on a moving window of successive frames that slides over the whole frame sequence (shot). The set of frames in each window is tested for content homogeneity using an appropriate unimodality test. Each window is thus characterized as unimodal or non-unimodal, and the frame sequence of each non-unimodal window is split into two (possibly unimodal) segments. In this way, each video shot is segmented into unimodal segments, and the keyframes are computed as the representative frames (medoids) of those segments. An important aspect of the method is that the number of keyframes does not need to be specified in advance, since the number of segments is computed automatically. Numerical experiments demonstrate that our method provides reasonable estimates of the number of ground-truth keyframes, while extracting non-repetitive keyframes that efficiently summarize the content of each shot.
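
The pipeline described above (slide a window, test it for homogeneity, split non-unimodal windows, take the medoid of each resulting segment) can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not specify the unimodality test, the splitting rule, or how the window advances, so the sketch makes assumptions: `looks_unimodal` is a crude scatter-ratio placeholder rather than a statistical test, `best_split` is an assumed minimum-scatter cut, and the names `extract_keyframes`, `window`, `step`, and `ratio` are hypothetical. Frames are assumed to be already reduced to numeric feature vectors.

```python
import math


def scatter(frames):
    """Sum of squared Euclidean distances from each frame to the mean frame."""
    if not frames:
        return 0.0
    dim = len(frames[0])
    mean = [sum(f[d] for f in frames) / len(frames) for d in range(dim)]
    return sum(math.dist(f, mean) ** 2 for f in frames)


def best_split(frames):
    """Assumed splitting rule: the cut k that minimizes the combined
    scatter of frames[:k] and frames[k:]. Returns (k, cost)."""
    cost, k = min(
        (scatter(frames[:k]) + scatter(frames[k:]), k)
        for k in range(1, len(frames))
    )
    return k, cost


def looks_unimodal(frames, ratio=0.1):
    """Placeholder heuristic, NOT a statistical unimodality test: the window
    is called non-unimodal when the best two-way split removes most of its
    scatter. A real test would be substituted here."""
    total = scatter(frames)
    if total == 0.0 or len(frames) < 2:
        return True
    _, cost = best_split(frames)
    return cost / total > ratio


def medoid_index(frames):
    """Index of the frame with the smallest total distance to all others."""
    totals = [sum(math.dist(f, g) for g in frames) for f in frames]
    return totals.index(min(totals))


def extract_keyframes(frames, window=8, step=1):
    """Slide a window over the shot, cut wherever a window fails the
    homogeneity check, and return one medoid index per segment."""
    if not frames:
        return []
    boundaries = [0]
    pos = 0
    while pos + window <= len(frames):
        chunk = frames[pos:pos + window]
        if looks_unimodal(chunk):
            pos += step            # homogeneous window: keep sliding
        else:
            k, _ = best_split(chunk)
            boundaries.append(pos + k)
            pos += k               # resume right after the new cut
    boundaries.append(len(frames))
    return [
        start + medoid_index(frames[start:end])
        for start, end in zip(boundaries, boundaries[1:])
    ]


if __name__ == "__main__":
    # Toy shot: two visually distinct runs of nine frames each.
    shot = [(float(i), 0.0) for i in range(9)]
    shot += [(100.0 + i, 50.0) for i in range(9)]
    print(extract_keyframes(shot, window=6))
```

On the toy shot the sketch places a single cut between the two runs and returns one keyframe index per run, so the number of keyframes follows from the number of detected segments rather than from a user-supplied parameter. The threshold `ratio` is an artifact of the placeholder check and has no counterpart in the abstract.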
