Social media video summarization using multi-Visual features and Kohnen's Self Organizing Map

Seema Rani, Mukesh Kumar

doi:10.1016/j.ipm.2019.102190

Abstract

Social networking tools such as Facebook, YouTube, Twitter, and Instagram are becoming major platforms for communication. YouTube, one of the primary video sharing platforms, serves over 100 million distinct videos, and 300 hours of video are uploaded to it every minute along with textual data. This massive amount of multimedia data needs to be managed efficiently, and irrelevant and redundant data needs to be removed. Video summarization deals with the problem of redundant data in a video. A summarized video contains the most distinct frames, which are termed key frames. Most research on key frame extraction considers only a single visual feature, which is not sufficient to capture the full pictorial detail and hence degrades the quality of the generated video summary. There is therefore a need to explore multiple visual features for key frame extraction. This work proposes a key frame extraction technique based on the fusion of four visual features: correlation of RGB color channels, color histogram, mutual information, and moments of inertia. A Kohonen Self Organizing Map is used as the clustering approach to find the most representative frames among those remaining after fusion. Useless frames are discarded, and frames having the maximum Euclidean distance within a cluster are selected as the final key frames. The results of the proposed technique are compared with existing video summarization techniques (user-generated summary, Video SUMMarization (VSUMM), and Video Key Frame Extraction through Dynamic Delaunay Clustering (VKEDDCSC)) and show a considerable improvement in terms of fidelity and Shot Reconstruction Degree (SRD) score.
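The clustering and selection step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes each candidate frame has already been reduced to a fused numeric feature vector, uses a small one-dimensional Kohonen map written with the Python standard library, and reads "maximum Euclidean distance within a cluster" as the frame farthest from its cluster's weight vector. The function names (`train_som`, `select_key_frames`) and all parameter values are hypothetical.

```python
import math
import random


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def train_som(vectors, n_units=3, epochs=50, lr0=0.5, seed=0):
    """Train a 1-D Kohonen map and return one weight vector per unit.

    Assumption: the map size, learning rate, and decay schedule are
    illustrative choices, not values taken from the paper.
    """
    rng = random.Random(seed)
    # Initialise each unit from a randomly chosen input vector.
    weights = [list(rng.choice(vectors)) for _ in range(n_units)]
    radius0 = max(n_units / 2.0, 1.0)
    for epoch in range(epochs):
        frac = epoch / epochs
        lr = lr0 * (1.0 - frac)                    # linearly decaying learning rate
        radius = max(radius0 * (1.0 - frac), 0.5)  # shrinking neighbourhood
        for v in rng.sample(vectors, len(vectors)):
            # Best matching unit: the unit whose weights are closest to v.
            bmu = min(range(n_units), key=lambda u: euclidean(weights[u], v))
            for u in range(n_units):
                # Gaussian neighbourhood on the 1-D grid of units.
                h = math.exp(-((u - bmu) ** 2) / (2.0 * radius ** 2))
                weights[u] = [w + lr * h * (x - w) for w, x in zip(weights[u], v)]
    return weights


def select_key_frames(vectors, weights):
    """Return sorted frame indices, one per non-empty cluster.

    Assumption: within each cluster, the frame with the largest Euclidean
    distance to the cluster's weight vector is kept as the key frame.
    """
    clusters = {}
    for idx, v in enumerate(vectors):
        bmu = min(range(len(weights)), key=lambda u: euclidean(weights[u], v))
        clusters.setdefault(bmu, []).append(idx)
    keys = [
        max(members, key=lambda i: euclidean(vectors[i], weights[bmu]))
        for bmu, members in clusters.items()
    ]
    return sorted(keys)


if __name__ == "__main__":
    # Toy fused feature vectors standing in for six candidate frames.
    frames = [[0.0, 0.1], [0.1, 0.0], [5.0, 5.1], [5.2, 4.9], [9.9, 0.1], [10.0, 0.0]]
    som = train_som(frames, n_units=3)
    print(select_key_frames(frames, som))
```

The number of map units bounds the number of key frames returned, so in this sketch it acts as the upper limit on summary length; how the paper chooses the map size is not stated in the abstract.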

Full Text