Abstract

Twitter data are becoming an important part of modern political science research, but key aspects of the inner workings of Twitter streams, as well as self-censorship on the platform, require further research. A particularly important research agenda is to understand removal rates of politically charged tweets. In this article, I provide a strategy to understand removal rates on Twitter, particularly on politically charged topics. First, I test the technical properties of Twitter’s API that may distort analyses of removal rates. Results show that the forward stream does not capture every possible tweet: between 2 and 5 percent of tweets are lost on average, even when the volume of tweets is low and the firehose is not needed. Second, I collect data from Twitter’s streams on contentious topics, such as terrorism or political leaders, and on non-contentious topics, such as types of food. Multilevel analysis is used to detect uncommon removal rate patterns. Results show significant differences in the removal of tweets between different topic groups. This article provides the first systematic comparison of information loss and removal on Twitter, as well as a strategy to collect valid removal samples of tweets.

Summary

Introduction

Researchers across the social sciences are becoming increasingly interested in using Twitter data in their studies and in understanding its limitations [1,2,3,4]. The data are extraordinarily abundant and readily available to the public through the company’s two main APIs (forward stream and backward search). This is attractive in those social science fields in which data collection is often an arduous and expensive process. An important research agenda is to understand removal rates of politically charged tweets. This article provides strategies to understand removal rates on Twitter and to detect anomalies on politically charged topics.
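
To make the two quantities discussed here concrete, the following is a minimal sketch of how information loss and removal could be computed from sets of tweet IDs. It is an illustration under stated assumptions, not the article's actual procedure: the function names (`loss_rate`, `removal_rate`, `removal_by_topic`), the idea of comparing forward-stream IDs against backward-search IDs, and the toy IDs are all hypothetical.

```python
from typing import Dict, Set


def loss_rate(stream_ids: Set[int], search_ids: Set[int]) -> float:
    """Share of all observed tweets that the forward stream missed.

    Assumption: the union of both collections approximates the full
    set of tweets on the topic during the collection window.
    """
    all_ids = stream_ids | search_ids
    if not all_ids:
        return 0.0
    missed = all_ids - stream_ids
    return len(missed) / len(all_ids)


def removal_rate(collected_ids: Set[int], available_ids: Set[int]) -> float:
    """Share of collected tweets that can no longer be retrieved later."""
    if not collected_ids:
        return 0.0
    removed = collected_ids - available_ids
    return len(removed) / len(collected_ids)


def removal_by_topic(
    collected: Dict[str, Set[int]], available: Dict[str, Set[int]]
) -> Dict[str, float]:
    """Removal rate per topic; topics with no re-check data count as fully removed."""
    return {
        topic: removal_rate(ids, available.get(topic, set()))
        for topic, ids in collected.items()
    }


# Toy, made-up IDs used only to show the arithmetic.
stream = {1, 2, 3, 4}
search = {1, 2, 3, 4, 5}
toy_loss = loss_rate(stream, search)  # 1 missed out of 5 observed

collected = {"contentious": {1, 2, 3, 4}, "food": {10, 11, 12, 13}}
available = {"contentious": {1, 2, 4}, "food": {10, 11, 12, 13}}
toy_removal = removal_by_topic(collected, available)
```

Per-topic rates of this kind are only descriptive; whether differences between topic groups are significant is the question the article addresses with multilevel analysis, which this sketch does not attempt.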
