Abstract

General video search and recommendation systems rely primarily on metadata and personal information. Metadata includes file names, keywords, tags, and genres, and is used to describe a video's content. The video platform assesses the relevance of a user's search query to the video metadata and ranks search results by relevance; recommendations are drawn from videos whose metadata is judged similar to that of the video the user is currently watching. Most platforms provide search and recommendation services through separate algorithms for metadata and personal information, so metadata plays a vital role in video search. Video service platforms develop a variety of algorithms to give users more accurate search results and recommendations, and quantifying video similarity is essential to improving that accuracy. However, because basic metadata is supplied mainly by content producers, it is open to manipulation. In addition, the apparent resemblance between similar video segments may diminish depending on segment duration. This paper proposes a metadata expansion model that utilizes object recognition and Speech-to-Text (STT) technology. The model selects key objects by analyzing the frequency of their appearance in the video, separately extracts the audio track, and transcribes it into a script. The script is quantified by tokenizing it into words using text-mining techniques. By augmenting metadata with key objects and script tokens, video content search and recommendation platforms are expected to deliver results closer to user search terms and to recommend more closely related content.
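To make the proposed pipeline concrete, the sketch below is a minimal Python illustration of the metadata expansion step, not the paper's implementation. It assumes object detection and STT have already been run, so frame_labels (per-frame detected labels) and transcript (the STT output) are hypothetical inputs, and the plain word-split tokenizer stands in for the paper's text-mining step.

from collections import Counter
import re

def expand_metadata(frame_labels, transcript, top_k=5, min_token_len=2):
    """Merge frequent objects and script tokens into expanded metadata.

    frame_labels: list of per-frame object-label lists, e.g. the output of
    an object detector run over sampled frames (detector out of scope here).
    transcript: script text produced by an STT engine for the audio track.
    """
    # Key-object selection: rank each label by the number of frames it
    # appears in and keep the top_k most frequent labels.
    object_counts = Counter(
        label for labels in frame_labels for label in set(labels)
    )
    key_objects = [label for label, _ in object_counts.most_common(top_k)]

    # Script tokenization: a simple word-level split; the paper's
    # text-mining step (stop-word removal, morphological analysis, etc.)
    # would replace this placeholder.
    tokens = re.findall(r"\w+", transcript.lower())
    script_tokens = sorted({t for t in tokens if len(t) >= min_token_len})

    return {"key_objects": key_objects, "script_tokens": script_tokens}

# Toy example: three sampled frames and a short transcript.
frames = [["dog", "ball"], ["dog", "person"], ["dog", "ball", "person"]]
print(expand_metadata(frames, "The dog chases the ball in the park"))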
