Abstract
An efficient shot summarization method is presented based on agglomerative clustering of the shot frames. Unlike other agglomerative methods, our approach relies on a cluster merging criterion that computes the content homogeneity of a merged cluster. An important feature of the proposed approach is the automatic estimation of the number of a shot's most representative frames, called keyframes. The method starts by splitting each video sequence into small, equal-sized clusters (segments). Then, agglomerative clustering is performed: at each step, a pair of clusters is selected from the current set and merged to form a larger unimodal (homogeneous) cluster. The algorithm proceeds until no further cluster merging is possible. At the end, the medoid of each final cluster is selected as a keyframe, and the set of keyframes constitutes the summary of the shot. Numerical experiments demonstrate that our method reasonably estimates the number of ground-truth keyframes, while extracting non-repetitive keyframes that efficiently summarize the content of each shot.
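The pipeline described above (equal-sized initial segments, repeated merging of a pair whose union is still homogeneous, medoids as keyframes) can be illustrated with a minimal Python sketch. The abstract does not specify the homogeneity criterion or the frame features, so several parts here are assumptions made only for illustration: the `spread` function (largest distance from any frame to the cluster medoid) and the threshold `tau` stand in for the paper's unimodality test, and the names `summarize_shot`, `segment_len`, and `features` are hypothetical.

```python
import numpy as np


def medoid_index(X):
    """Position (within X) of the row with the smallest total distance to all rows."""
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    return int(d.sum(axis=1).argmin())


def spread(X):
    """Assumed stand-in for a homogeneity score: max distance to the medoid."""
    m = X[medoid_index(X)]
    return float(np.linalg.norm(X - m, axis=1).max())


def summarize_shot(features, segment_len=5, tau=1.0):
    """Return sorted keyframe indices for one shot.

    features: (n_frames, dim) array of per-frame descriptors (assumed given).
    segment_len, tau: illustrative parameters, not taken from the paper.
    """
    n = len(features)
    # Step 1: split the shot into small, equal-sized segments of frame indices.
    clusters = [list(range(s, min(s + segment_len, n)))
                for s in range(0, n, segment_len)]

    # Step 2: merge the most homogeneous admissible pair until none is left.
    while True:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                score = spread(features[clusters[i] + clusters[j]])
                # A merge is admissible only if the union stays homogeneous.
                if score <= tau and (best is None or score < best[0]):
                    best = (score, i, j)
        if best is None:
            break  # no further cluster merging is possible
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]

    # Step 3: the medoid of each final cluster is a keyframe.
    return sorted(c[medoid_index(features[c])] for c in clusters)


# Toy shot: 10 frames of one scene followed by 10 frames of a different scene.
toy = np.array([[0.01 * i, 0.0] for i in range(10)]
               + [[10.0 + 0.01 * i, 0.0] for i in range(10)])
keyframes = summarize_shot(toy, segment_len=5, tau=1.0)
```

Because merging stops on its own once no admissible pair remains, the number of keyframes is an outcome of the procedure rather than an input, which mirrors the automatic estimation the abstract highlights. In this sketch that number depends on the assumed threshold `tau`; the paper's own criterion may behave differently.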