Abstract

Video summarization (VS) is a key video signal processing technique for unmanned aerial vehicles (UAVs). Essentially, VS aims to eliminate highly similar, redundant frames in aerial videos (AVs), which enables quick browsing, efficient retrieval, and compact storage without losing important information. Measuring the similarity between video frames is not trivial, since different video applications call for different criteria. Moreover, aerial video frames captured by UAVs exhibit almost exclusively gradual transitions rather than sudden changes between frames. In this paper, to capture the subtle variations between UAV frames, a sequence-guided Siamese neural network (SGSNN) approach is developed to extract semantic features. Specifically, sequential correlations are incorporated into the learning objective of the proposed SGSNN, implemented through a logarithm-based semantic distance metric designed to automatically label the similarity between frames. For gradual-transition shot segmentation in AVs, a co-voting method is presented to decide the shot membership of each input frame. Extensive experiments on self-made UAV video datasets validate the effectiveness of our method.
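The abstract names a logarithm-based semantic distance for auto-labeling frame pairs and a co-voting rule for shot membership, but gives no formulas. The sketch below is a minimal illustration of how such components could fit together, assuming frame embeddings have already been extracted (e.g. by the Siamese branches); the exact metric, threshold `tau`, and voting rule are hypothetical, not the paper's definitions.

```python
import numpy as np

def log_distance(f1, f2):
    # Hypothetical logarithm-based semantic distance between two frame
    # embeddings; log1p compresses large distances, emphasizing the
    # subtle variations typical of gradually changing aerial footage.
    return np.log1p(np.linalg.norm(f1 - f2))

def auto_label(f1, f2, tau=1.0):
    # Auto-label a frame pair as similar (1) or dissimilar (0) by
    # thresholding the distance; tau is an assumed free parameter.
    return 1 if log_distance(f1, f2) < tau else 0

def co_vote_membership(frame, shot_frames, tau=1.0):
    # Co-voting sketch: the incoming frame joins the current shot if a
    # majority of the shot's frames vote it similar; otherwise it would
    # start a new shot (gradual-transition boundary).
    votes = [auto_label(frame, s, tau) for s in shot_frames]
    return sum(votes) > len(votes) / 2
```

For example, with three frames already in a shot, a new frame close to two of them in embedding space wins the vote and is kept in the shot; identical embeddings have distance exactly 0.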
