Abstract

Videos have become a predominant part of users’ daily lives on the Web, especially with the emergence of online video sharing systems such as YouTube. Since users can independently share videos in these systems, some videos can be duplicates (i.e., identical or very similar videos). Despite having the same content, duplicates can differ in context, for example, in their associated metadata (i.e., tags, title) and their popularity scores (i.e., number of views, comments). Quantifying these differences is important to understand how users associate metadata with videos and to understand possible reasons that influence the popularity of videos, which is crucial for video information retrieval mechanisms, association of advertisements to videos, and performance issues related to the use of caches and content distribution networks (CDNs). This work presents a wide quantitative characterization of the context differences among identical contents. Using a large video sample collected from YouTube, we construct a dataset of duplicates. Our measurement analysis provides several interesting findings that can have implications for how videos should be retrieved in video sharing websites as well as for advertising systems that need to understand the role that users play when they create content in services such as YouTube.

Highlights

  • Content is rapidly moving towards more video

  • Video search became a popular service on the Web, and YouTube accounts for a large fraction of all Google search queries in the US, generating 3.5 billion searches in August 2009 [10]

  • There is a need to understand how users associate metadata with videos on video sharing services, such as YouTube

Summary

Introduction

Content is rapidly moving towards more video, and the signs are evident. Duplicated videos not only differ in the metadata associated with them, but also present different popularity statistics. Despite having similar content, duplicated videos may exhibit different metadata (e.g., tags and categories) and may have different popularity indicators (e.g., number of views and ratings). One motivation for studying these differences is the need to understand how users associate metadata with videos on video sharing services, such as YouTube. Our measurement analysis provides several interesting findings that can have implications for how duplicated videos should be retrieved in video sharing websites as well as for advertising systems that need to understand the role of duplicated videos in services such as YouTube.

Main findings
Related work
Data collection
Contextual analysis
Quality and popularity
Duplicate owners
Metadata
Categories
Duplicate content creation
Users and their duplicates
Suspicious duplicate creation
Findings
Concluding remarks