Abstract
The growing number of video cameras has led to explosive growth in the amount of captured video, driven especially by the millions of surveillance cameras that operate 24 hours a day. Browsing and retrieving such video is time consuming, and video synopsis is one of the most effective ways to browse and index it, enabling hours of video to be reviewed in just minutes. However, generating a video synopsis that preserves the essential activities of the original video remains a costly, labor-intensive, and time-intensive task. This paper proposes an approach to generating video synopsis with complete foreground and clearer trajectories of moving objects. First, a one-stage CNN-based object detector is employed for object extraction and classification. Then, an attention-RetinaNet is integrated with a Local Transparency-Handling Collision (LTHC) algorithm, which optimizes trajectory combination and makes the trajectories of moving objects clearer. Finally, experiments show that the useful video information is fully retained in the resulting video, the detection accuracy is improved by 4.87%, and the compression ratio reaches 4.94, although the reduction in detection time is not obvious.
Highlights
Video synopsis is a task with important research value and practical significance
We provide a method that realizes class-aware video synopsis. Unlike the work in [30], we use the labels generated by the Convolutional Neural Network (CNN)-based detector itself to classify objects instead of adding an extra classification network, which reduces both the additional network design work and the computational cost of the algorithm
We propose two methods: first, an attention-RetinaNet object detection method; second, a Local Transparency-Handling Collision (LTHC) method. These two methods improve the quality of video synopsis in object detection and activity rearrangement, respectively
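The label-based classification idea in the highlights above can be illustrated with a minimal sketch: detections are grouped by the class label the detector already emits, so no separate classification network is required. The record layout (`frame_index`, `class_label`, `bounding_box`) and the helper name `group_by_label` are illustrative assumptions, not part of the paper's implementation.

```python
from collections import defaultdict

# Hypothetical detection records emitted by a CNN-based detector:
# (frame_index, class_label, bounding_box as (x1, y1, x2, y2)).
detections = [
    (0, "car", (10, 20, 50, 60)),
    (0, "person", (100, 40, 120, 90)),
    (3, "car", (15, 22, 55, 62)),
]

def group_by_label(dets):
    """Index detections by the class label the detector itself produces,
    avoiding an extra classification network."""
    groups = defaultdict(list)
    for frame_idx, label, box in dets:
        groups[label].append((frame_idx, box))
    return dict(groups)

print(sorted(group_by_label(detections)))  # ['car', 'person']
```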
Summary
Video synopsis is a task with important research value and practical significance. With the development of 7 × 24 video surveillance systems, the amount of surveillance video data is growing sharply, and browsing and retrieving video efficiently has become a challenging task. In road or pedestrian monitoring, moving objects need to be identified and analyzed in very long videos. The traditional method is to browse the video by manually controlling playback, which costs too much time, and important information may be missed due to human error. Original videos contain a large number of "static" frames, each of which contains only the background without any moving objects. Static frames lack useful information such as object motion
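The static-frame observation above can be sketched as a simple frame-differencing filter: a frame is considered "static" when almost no pixels change relative to its predecessor, and such frames can be dropped. This is a minimal illustration of the idea, not the paper's method; the function names and thresholds (`pixel_thresh`, `ratio_thresh`) are assumptions.

```python
import numpy as np

def is_static(prev_frame, frame, pixel_thresh=25, ratio_thresh=0.001):
    """Flag a frame as 'static' when too few pixels change vs. the previous frame."""
    diff = np.abs(frame.astype(np.int16) - prev_frame.astype(np.int16))
    changed = np.count_nonzero(diff > pixel_thresh)
    return changed / diff.size < ratio_thresh

def drop_static_frames(frames, **kwargs):
    """Keep the first frame plus every frame that differs from its predecessor."""
    kept = [frames[0]]
    for prev, cur in zip(frames, frames[1:]):
        if not is_static(prev, cur, **kwargs):
            kept.append(cur)
    return kept

# Toy grayscale clip: 5 identical background frames, then one with a bright "object".
bg = np.zeros((4, 4), dtype=np.uint8)
obj = bg.copy()
obj[1:3, 1:3] = 200
clip = [bg] * 5 + [obj]
print(len(drop_static_frames(clip)))  # 2: first frame + the frame with motion
```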