Abstract

As smartphones and other camera-equipped devices become more widespread and easier to use, more people are recording and sharing videos through social media and video-streaming websites, making video an essential medium for disseminating information. Watching and evaluating such large volumes of video manually is impractical. Automated video summarization produces a concise representation of the source material, which is useful for indexing and categorizing long videos in a database; generating a good summary, however, is a challenging task. This research aims to automate video summarization with a two-stream architecture in which a deep convolutional neural network (DCNN) in each stream extracts a video's spatial and temporal components. A two-dimensional convolutional neural network (CNN) generates highlight scores for video segments from spatial information, while a three-dimensional (3-D) CNN captures temporal information. The per-segment scores from the two streams are averaged to determine which portions of the video are the most compelling. Since a highlight score conveys only a relative degree of interest, the DCNN in each stream is trained with a pairwise deep-ranking model, tuned so that highlight segments score higher than the rest of the video. A video summary is then assembled from the retrieved highlight clips.
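The abstract does not include code, but the scoring-and-ranking pipeline it describes can be sketched as below. This is a minimal illustration assuming PyTorch; the backbone depths, layer sizes, equal fusion weights, and ranking margin are placeholder assumptions, not the authors' configuration.

```python
# Minimal sketch of the two-stream highlight-scoring idea: a 2-D CNN scores a
# segment from spatial (frame) information, a 3-D CNN from temporal (clip)
# information, the two scores are averaged, and training uses a pairwise
# ranking loss. All layer sizes and the margin are illustrative assumptions.
import torch
import torch.nn as nn

class SpatialStream(nn.Module):
    """2-D CNN: scores a segment from a single representative frame."""
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.score = nn.Linear(64, 1)

    def forward(self, frame):                  # frame: (B, 3, H, W)
        return self.score(self.features(frame).flatten(1)).squeeze(1)

class TemporalStream(nn.Module):
    """3-D CNN: scores a segment from a short stack of frames."""
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv3d(3, 32, 3, padding=1), nn.ReLU(), nn.MaxPool3d(2),
            nn.Conv3d(32, 64, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool3d(1),
        )
        self.score = nn.Linear(64, 1)

    def forward(self, clip):                   # clip: (B, 3, T, H, W)
        return self.score(self.features(clip).flatten(1)).squeeze(1)

def fused_score(spatial, temporal, frame, clip):
    """Average the per-stream scores, as the abstract describes."""
    return 0.5 * (spatial(frame) + temporal(clip))

def pairwise_ranking_loss(pos_score, neg_score, margin=1.0):
    """Pairwise deep-ranking objective: a highlight segment should outscore
    a non-highlight segment from the same video by at least `margin`."""
    return torch.clamp(margin - (pos_score - neg_score), min=0).mean()
```

In this framing, each training example is a (highlight, non-highlight) segment pair from the same video, and the loss pushes their fused scores apart; at inference time, segments are ranked by `fused_score` and the top-ranked clips are concatenated into the summary.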
