From Keyframes to Key Objects: Video Summarization by Representative Object Proposal Selection

Jingjing Meng,Junsong Yuan,Yap-Peng Tan,Hongxing Wang

doi:10.1109/cvpr.2016.118

Abstract

We propose to summarize a video into a few key objects by selecting representative object proposals generated from video frames. This representative selection problem is formulated as a sparse dictionary selection problem, i.e., choosing a few representatives object proposals to reconstruct the whole proposal pool. Compared with existing sparse dictionary selection based representative selection methods, our new formulation can incorporate object proposal priors and locality prior in the feature space when selecting representatives. Consequently it can better locate key objects and suppress outlier proposals. We convert the optimization problem into a proximal gradient problem and solve it by the fast iterative shrinkage thresholding algorithm (FISTA). Experiments on synthetic data and real benchmark datasets show promising results of our key object summarization apporach in video content mining and search. Comparisons with existing representative selection approaches such as K-mediod, sparse dictionary selection and density based selection validate that our formulation can better capture the key video objects despite appearance variations, cluttered backgrounds and camera motions.

Full Text