Abstract
Video summarization is to extract effective information from videos to quickly obtain the most informative summary. Most of the existing video summarization methods use recurrent neural networks and their variants such as long and short-term memory (LSTM), to simulates the variable range time dependence between video frames. However, those methods can only process serial inputs of the video frames along with the hidden layer information from the previous time step, which affects the performance and the quality of video summarization. To tackle this issue, we present a deep non-local video summarization network (DN-VSN) for original video abstracts in this paper. Our unsupervised model treats video summarization as a sequence of decision problems. Given an input video, the probability that a video frame is selected as a part of the summary is obtained through a non-local convolutional network, and a strategy gradient algorithm of reinforcement learning is adopted for optimization in the training phase. The proposed method has been tested on four widely used datasets. The experimental results show the superiority of the proposed unsupervised model over the state-of-the-art methods.
Talk to us
Join us for a 30 min session where you can share your feedback and ask us any queries you have
Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.