Abstract
Background: Technological advancement has improved the quality of human life and has also led people to produce large amounts of data in the form of text, images, and videos. Significant effort is therefore needed to devise methods for analyzing and summarizing this data within storage constraints. Video summaries can be generated either from keyframes or from skims/shots. Here, keyframe extraction is based on deep learning object detection techniques. Various object detection algorithms were reviewed for generating and selecting the best possible frames as keyframes. A set of frames is extracted from the original video sequence and, depending on the technique used, one or more frames of the set are chosen as keyframes, which then become part of the summarized video. This paper discusses the selection of keyframe extraction techniques in detail. Methods: The paper focuses on summary generation for office surveillance videos, with the main emphasis on keyframe extraction techniques. For this purpose, models such as MobileNet, SSD, and YOLO were used. A comparative analysis of their efficiency showed that YOLO performed better than the others. Keyframe selection techniques, namely sufficient content change, maximum frame coverage, minimum correlation, curve simplification, and clustering based on human presence in the frame, were implemented. Results: Variable-length and fixed-length video summaries were generated and analyzed for each keyframe selection technique on office surveillance videos. The analysis shows that the output video obtained with the clustering and curve simplification approaches is compressed to half the size of the original video and requires considerably less storage space.
The technique that selects keyframes based on the change in frame content between consecutive frames produces the best output for office surveillance videos. Conclusion: In this paper, we discussed the process of generating a synopsis of a video that highlights the important portions and discards the trivial and redundant parts. First, we described object detection algorithms such as YOLO and SSD, used in conjunction with neural networks such as MobileNet, to obtain the probabilistic score of an object present in the video. For every frame of the input video, these algorithms generate the probability that a person appears in the image. The object detection results are then passed to keyframe extraction algorithms to obtain the summarized video. Our comparative analysis of keyframe selection techniques for office videos will help determine which technique is preferable.
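The pipeline summarized above (per-frame person probabilities passed to a keyframe selector) can be illustrated with a minimal sketch of selection by sufficient content change. This is not the authors' implementation: the function name `select_keyframes`, the `threshold` value, and the sample scores are assumptions made for illustration, and the change in the person-detection score stands in here for "frame content change".

```python
from typing import List, Sequence


def select_keyframes(person_scores: Sequence[float], threshold: float = 0.3) -> List[int]:
    """Pick frame indices whose person score differs enough from the last keyframe.

    person_scores: assumed per-frame probability that a person is present,
                   as an object detector might report it (hypothetical input).
    threshold:     assumed minimum score change that counts as "sufficient".
    """
    if len(person_scores) == 0:
        return []

    keyframes = [0]  # the first frame always starts the summary
    for idx in range(1, len(person_scores)):
        last_score = person_scores[keyframes[-1]]
        # Keep the frame only if its content changed enough since the last keyframe.
        if abs(person_scores[idx] - last_score) > threshold:
            keyframes.append(idx)
    return keyframes


# Hypothetical scores: empty office, a person enters, then leaves.
scores = [0.05, 0.07, 0.90, 0.92, 0.88, 0.10, 0.12]
print(select_keyframes(scores))  # → [0, 2, 5]
```

Comparing each frame against the last selected keyframe, rather than against the immediately preceding frame, is one design choice among several; it keeps slow drifts from being missed but yields a variable-length summary that depends on the chosen threshold.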
More From: Recent Advances in Computer Science and Communications