Mining One Percent of Twitter: Collections, Baselines, Sampling

Carolin Gerlitz,Bernhard Rieder

doi:10.5204/mcj.620

Abstract

The objective of the paper is to reflect on the affordances of different techniques for making Twitter collections and to suggest the use of a random sampling technique, made possible by Twitter’s Streaming API (Application Programming Interface), for baselining, scoping, and contextualising practices and issues. It discusses this technique by analysing a one per cent sample of all tweets posted during a 24-hour period and introducing a number of analytical directions considered useful for qualifying some of the core elements of the platform, in particular hashtags. To situate the proposal, the report first discusses how platforms propose particular affordances but leave considerable margins for the emergence of a wide variety of practices. This argument is then related to the question of how medium and sampling technique are intrinsically connected. Background Social media platforms present numerous challenges to empirical research, making it different from researching cases in offline environments, but also different from studying the “open” Web. Because of the limited access possibilities and the sheer size of platforms like Facebook or Twitter, the question of delimitation, i.e. the selection of subsets to analyse, is particularly relevant. Whilst sampling techniques have been thoroughly discussed in the context of social science research, sampling procedures in the context of social media analysis are far from being fully understood. Even for Twitter, a platform having received considerable attention from empirical researchers due to its relative openness to data collection, methodology is largely emergent. In particular the question of how smaller collections relate to the entirety of activities of the platform is quite unclear. Recent work comparing case based studies to gain a broader picture and the development of graph theoretical methods for sampling are certainly steps in the right direction, but it seems that truly large-scale Twitter studies are limited to computer science departments, where epistemic orientation can differ considerably from work done in the humanities and social sciences.

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Journal: M/C Journal	Publication Date: Mar 2, 2013
Citations: 67	License type: cc-by-nc-nd

R Discovery Prime

R Discovery Prime

Mining One Percent of Twitter: Collections, Baselines, Sampling

Abstract

Talk to us

Similar Papers

More From: M/C Journal

Lead the way for us

Similar Papers

Twitter Archives and the Challenges of "Big Social Data" for Media and Communication Research
Jean Burgess ... Axel Bruns
M/C Journal | VOL. 15
Jean Burgess, et. al.Jean Burgess ... Axel Bruns
11 Oct 2012
M/C Journal | VOL. 15

Data Curation Strategies to Support Responsible Big Social Research and Big Social Data Reuse
Sara Mannheimer
International Journal of Digital Curation | VOL. 17
Sara MannheimerSara Mannheimer
06 Dec 2022
International Journal of Digital Curation | VOL. 17

Big Social Data Approaches in Internet Studies: The Case of Twitter
Axel Bruns
-
Axel BrunsAxel Bruns
01 Jan 2018
01 Jan 2018

“Meaning” in Social Media
Stine Lomborg
Social Media + Society | VOL. 1
Stine LomborgStine Lomborg
01 Apr 2015
Social Media + Society | VOL. 1

Editage

Paperpal

R Discovery

Mind the Graph

R Discovery Prime

R Discovery Prime

Mining One Percent of Twitter: Collections, Baselines, Sampling

Abstract

Talk to us

Similar Papers

More From: M/C Journal