Abstract

This article presents the results of methodological experimentation that utilises machine learning to investigate automated copyright enforcement on YouTube. Using a dataset of 76.7 million YouTube videos, we explore how digital and computational methods can be leveraged to better understand content moderation and copyright enforcement at a large scale. We used the BERT language model to train a machine learning classifier to identify videos in categories that reflect ongoing controversies in copyright takedowns. We use this classifier to explore, in a granular way, how copyright is enforced on YouTube, using both statistical methods and qualitative analysis of our categorised dataset. We provide a large-scale systematic analysis of removal rates from Content ID’s automated detection system and from the largely automated, text-search-based Digital Millennium Copyright Act notice-and-takedown system. These are complex systems that are often difficult to analyse, and YouTube only makes data available at high levels of abstraction. Our analysis provides a comparison of different types of automation in content moderation, and we show how these different systems play out across different categories of content. We hope that this work provides a methodological base for continued experimentation with the use of digital and computational methods to enable large-scale analysis of the operation of automated systems.

Highlights

  • YouTube’s baseline legal obligations for enforcing copyright are set out by the notice-and-takedown system established under the United States Digital Millennium Copyright Act (DMCA) legislation and propagated around the world. Notice-and-takedown has become an extremely important industrial mechanism for enforcing copyright; copyright owners employ rights management companies who use automated search tools to send hundreds of complaints of notices every year (Urban et al, 2017)

  • When a video is blocked by Content ID, YouTube will still host a link to the video and will provide an error message explaining that the video was blocked due to a copyright claim

  • Videos were most frequently removed from YouTube by users themselves, followed by removals due to an account termination and Content ID blocks (See Table I)

Introduction

YouTube’s baseline legal obligations for enforcing copyright are set out by the notice-and-takedown system established under the United States DMCA legislation and propagated around the world. Notice-and-takedown has become an extremely important industrial mechanism for enforcing copyright; copyright owners employ rights management companies who use automated search tools to send hundreds of complaints of notices every year (Urban et al, 2017). YouTube goes beyond its obligations under the DMCA when enforcing copyright, and has developed a series of additional tools and privately negotiated systems and policies (Bridy, forthcoming). The most visible of these tools is Content ID, YouTube’s automated rights management system that allows rightsholders to block, monetise, mute or track videos that contain their works. Google reports that on YouTube, 98% of copyright matters are decided by Content ID (Google, 2016).
