Abstract

The growing number of video cameras has led to an explosive increase in the amount of captured video, in particular from the millions of surveillance cameras that operate 24 hours a day. Browsing and retrieving such video is time-consuming, and video synopsis is one of the most effective ways to browse and index it, enabling hours of video to be reviewed in just minutes. However, generating a video synopsis that preserves the essential activities of the original video remains costly, labor-intensive, and time-consuming. This paper proposes an approach to generating video synopsis with complete foreground objects and clearer trajectories of moving objects. First, a one-stage CNN-based object detector is employed for object extraction and classification. Then, an attention-RetinaNet is integrated with a Local Transparency-Handling Collision (LTHC) algorithm, which optimizes the trajectory combination and makes the trajectories of moving objects clearer. Finally, experiments show that the useful video information is fully retained in the resulting video, detection accuracy is improved by 4.87%, and the compression ratio reaches 4.94, although the reduction in detection time is not significant.

Highlights

  • Video synopsis is a task with important research value and practical significance

  • We provide a method that realizes classified video synopsis. Unlike the work in [30], we use the labels generated by the convolutional neural network (CNN)-based detector itself to classify objects, instead of adding an extra classification network; this reduces both the additional network design work and the computational cost of the algorithm

  • We propose two methods: first, an attention-RetinaNet object detection method, and second, a Local Transparency-Handling Collision (LTHC) method. They improve the quality of video synopsis in object detection and activity rearrangement, respectively
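The first highlight's idea of reusing the detector's own class labels, rather than running a separate classification network, can be sketched as a simple grouping step. The detection records, field names, and thresholds below are illustrative assumptions, not the paper's actual data structures:

```python
from collections import defaultdict

# Hypothetical detection records: a one-stage detector such as RetinaNet
# already emits a class label and score with every box, so no extra
# classification network is needed to sort objects by category.
detections = [
    {"frame": 10, "box": (34, 50, 80, 120), "label": "person", "score": 0.91},
    {"frame": 10, "box": (200, 40, 60, 90), "label": "car", "score": 0.88},
    {"frame": 11, "box": (36, 52, 80, 120), "label": "person", "score": 0.93},
]

def group_by_label(dets, min_score=0.5):
    """Bucket detections by the label the detector itself produced."""
    groups = defaultdict(list)
    for d in dets:
        if d["score"] >= min_score:
            groups[d["label"]].append(d)
    return dict(groups)

groups = group_by_label(detections)
print(sorted(groups))          # ['car', 'person']
print(len(groups["person"]))   # 2
```

Each per-class bucket can then feed a class-specific synopsis, which is the saving the highlight describes: classification comes for free from the detection head.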


Introduction

Video synopsis is a task with important research value and practical significance. With the development of 7 × 24 video surveillance systems, the amount of surveillance video data is growing sharply, and browsing and retrieving video efficiently has become a challenging task. In road or pedestrian monitoring, certain moving objects need to be identified and analyzed in very long videos. The traditional method is to browse the video by manually controlling the playback, which costs too much time, and important information may still be missed through human error. Original videos contain a large number of ‘‘static’’ frames, each of which contains only the background without any moving objects. Static frames are short of useful information such as object motion
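The static frames described above can be filtered out with simple frame differencing before any synopsis is built. The sketch below is a minimal illustration of that idea using NumPy only; the function names and thresholds are assumptions for the example, not the paper's method:

```python
import numpy as np

def is_static(prev_frame, frame, pixel_thresh=25, ratio_thresh=0.002):
    """A frame counts as 'static' if almost no pixels changed vs. the previous one."""
    diff = np.abs(frame.astype(np.int16) - prev_frame.astype(np.int16))
    changed_ratio = (diff > pixel_thresh).mean()
    return changed_ratio < ratio_thresh

def keep_active_frames(frames, **kw):
    """Keep the first frame plus every frame that shows motion relative to its predecessor."""
    kept = [frames[0]]
    for prev, cur in zip(frames, frames[1:]):
        if not is_static(prev, cur, **kw):
            kept.append(cur)
    return kept

# Tiny synthetic example: three identical background frames, then one
# in which an 8x8 bright "object" appears.
bg = np.zeros((32, 32), dtype=np.uint8)
moving = bg.copy()
moving[8:16, 8:16] = 255
frames = [bg, bg.copy(), bg.copy(), moving]
print(len(keep_active_frames(frames)))  # 2: the first frame and the one with motion
```

Real systems typically use a learned background model rather than adjacent-frame differencing, but the example shows why static frames carry no synopsis-relevant information and can be dropped cheaply.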

