Abstract

Video summarization aims to generate short summaries that let viewers grasp the content of a video without watching it in full. Many methods for automatic video summarization have been proposed. Although they perform well, they still suffer from limited training data and sparse rewards. We propose a Progressive Reinforcement Learning Video Summarization structure (PRLVS) with an unsupervised reward, which measures, without annotations, the information and quality conveyed by the selected frames. To earn higher rewards, PRLVS adopts a “T”-type human thinking paradigm: it chooses some key frames and then checks whether their adjacent frames are better. To simulate this paradigm, we decompose the flat strategy into a hierarchical strategy consisting of a horizontal policy and a vertical policy. The two policies are optimized alternately, which densifies the reward while reducing the exploration space. Their cooperation also lets the agent capture the context of the whole video at every step. Extensive experiments on two benchmark datasets (SumMe and TVSum) show that PRLVS outperforms the compared methods and approaches the performance of supervised methods, which indicates that integrating our unsupervised reward into the progressive reinforcement learning structure is valuable for addressing the limited-annotation and sparse-reward problems.
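As a rough illustration of the "choose key frames, then check their neighbours" idea, the sketch below uses a greedy hill-climb in place of the learned horizontal and vertical policies. It is not the PRLVS algorithm: the function names, the one-pass loop, and the toy diversity reward (mean pairwise L1 distance between selected frame features) are all assumptions made for this example, since the abstract does not define the actual reward or policy networks.

```python
from itertools import combinations


def toy_reward(features, selected):
    """Hypothetical annotation-free reward: mean pairwise L1 distance
    between the feature vectors of the selected frames (a diversity proxy)."""
    pairs = list(combinations(sorted(selected), 2))
    if not pairs:
        return 0.0
    total = sum(
        sum(abs(a - b) for a, b in zip(features[i], features[j]))
        for i, j in pairs
    )
    return total / len(pairs)


def refine_selection(features, initial, reward=toy_reward):
    """Greedy stand-in for the 'T'-type loop (assumption, not the paper's
    method): walk across the chosen frames, and for each one test its two
    adjacent frames, keeping a swap only if the reward increases."""
    selected = list(initial)
    for pos in range(len(selected)):
        current = selected[pos]
        best = reward(features, selected)
        for neighbour in (current - 1, current + 1):
            # Skip frames outside the video or already in the summary.
            if neighbour < 0 or neighbour >= len(features) or neighbour in selected:
                continue
            candidate = selected[:pos] + [neighbour] + selected[pos + 1:]
            score = reward(features, candidate)
            if score > best:
                selected, best = candidate, score
    return sorted(selected)


if __name__ == "__main__":
    # Five frames with 1-D toy features; frames 1 and 2 are redundant.
    frames = [[0.0], [1.0], [1.0], [5.0], [1.0]]
    print(refine_selection(frames, [1, 2]))
```

Because every neighbour check yields an immediate reward comparison, feedback arrives at each step rather than only once the whole summary is built, which is the intuition behind the denser reward and smaller exploration space mentioned above. A learned policy would replace the fixed greedy rule used here.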
