Abstract

Video summarization (VS) is a key video signal processing technique for unmanned aerial vehicles (UAVs). Essentially, VS aims to eliminate highly similar, redundant frames in aerial videos (AVs), which enables quick browsing, efficient retrieval, and compact storage without losing important information. Measuring the similarity between video frames is not trivial, since different video applications call for different criteria. Moreover, aerial video frames captured by UAVs exhibit almost exclusively gradual transitions rather than sudden changes between frames. In this paper, to capture the subtle variations between UAV frames, a sequence-guided Siamese neural network (SGSNN) approach is developed to extract semantic features. Specifically, sequential correlations are incorporated into the learning objective of the proposed SGSNN, implemented through a logarithm-based semantic distance metric designed to automatically label the similarity between frames. For gradual-transition shot segmentation in AVs, a co-voting method is presented to decide the shot membership of each input frame. Extensive experiments on self-made UAV video datasets validate the effectiveness of our method.
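The abstract names a logarithm-based semantic distance for auto-labeling frame pairs and a co-voting rule for shot membership, but gives no formulas. The sketch below is a minimal illustration of how such components could fit together, assuming frame embeddings have already been extracted (e.g. by the Siamese branches); the exact metric, threshold `tau`, and voting rule are hypothetical, not the paper's definitions.

```python
import numpy as np

def log_distance(f1, f2):
    # Hypothetical logarithm-based semantic distance between two frame
    # embeddings; log1p compresses large distances, emphasizing the
    # subtle variations typical of gradually changing aerial footage.
    return np.log1p(np.linalg.norm(f1 - f2))

def auto_label(f1, f2, tau=1.0):
    # Auto-label a frame pair as similar (1) or dissimilar (0) by
    # thresholding the distance; tau is an assumed free parameter.
    return 1 if log_distance(f1, f2) < tau else 0

def co_vote_membership(frame, shot_frames, tau=1.0):
    # Co-voting sketch: the incoming frame joins the current shot if a
    # majority of the shot's frames vote it similar; otherwise it would
    # start a new shot (gradual-transition boundary).
    votes = [auto_label(frame, s, tau) for s in shot_frames]
    return sum(votes) > len(votes) / 2
```

For example, with three frames already in a shot, a new frame close to two of them in embedding space wins the vote and is kept in the shot; identical embeddings have distance exactly 0.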
