Abstract
Adapting detection to different shot types is a significant challenge for video summarization methods based on shot boundary detection. In our recent work, we introduced a new graph model for the feature modelling of frames and analysed changes in graph structure to improve the detection of shot boundaries. In this paper, we further explore the potential of graph models and propose a more general framework for online, real-time automatic video summarization. The framework develops a novel adaptive multiview graph difference analysis method to improve the algorithm's robustness in detecting different shot transitions. Previous fusion methods typically used a priori knowledge to assign weights to the various feature differences extracted from videos. In contrast, our framework weighs and fuses the resulting differences by learning the importance of the various video features from the structural changes of the corresponding multiview graphs. Additionally, we propose a new threshold-based adaptive decision method that dynamically selects the most accurate shot boundary decision threshold by analysing a small number of historical frames and learning the tolerance factor for the current shot. Experimental results show that the proposed method outperforms state-of-the-art methods in terms of precision and F-score on the VSUMM and YouTube datasets.
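To illustrate the flavour of the threshold-based adaptive decision step, the sketch below flags a shot boundary when a frame-difference score exceeds a threshold derived from a small window of historical frames. This is a minimal illustration, not the paper's method: the function name, the mean-plus-deviation threshold form, and the fixed `tolerance` value (which stands in for the learned tolerance factor) are all assumptions for the example.

```python
from collections import deque

def detect_boundaries(diff_scores, window=10, tolerance=3.0):
    """Flag indices where a frame-difference score exceeds an adaptive
    threshold computed from the last `window` scores in the current shot.

    `tolerance` is a hypothetical fixed stand-in for the tolerance
    factor that the paper learns per shot.
    """
    history = deque(maxlen=window)  # recent difference scores in the current shot
    boundaries = []
    for i, d in enumerate(diff_scores):
        if len(history) >= window:
            mean = sum(history) / len(history)
            var = sum((x - mean) ** 2 for x in history) / len(history)
            threshold = mean + tolerance * var ** 0.5
            if d > threshold:
                boundaries.append(i)
                history.clear()  # restart statistics inside the new shot
                continue
        history.append(d)
    return boundaries

# A burst of near-constant differences followed by one large jump:
# the jump at index 10 is reported as a boundary.
print(detect_boundaries([1.0] * 10 + [10.0] + [1.0] * 5))  # → [10]
```

The key property this mirrors is that the threshold is not global: it adapts to the statistics of the frames seen so far in the current shot, so slow-moving and fast-moving shots get different decision thresholds.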