An augmented multilingual Twitter dataset for studying the COVID-19 infodemic.

Christian E Lopez, Caleb Gallemore

doi:10.1007/s13278-021-00825-0

Open Access

https://doi.org/10.1007/s13278-021-00825-0

Journal: Social Network Analysis and Mining
Publication Date: Oct 20, 2021
Citations: 27
License type: cc-by-nc-sa

Affiliation: Lafayette College

Abstract

This work presents an openly available dataset to facilitate researchers’ exploration and hypothesis testing about the social discourse of the COVID-19 pandemic. The dataset currently consists of over 2.2 billion tweets (count as of September 2021) from all over the world, in multiple languages. The tweets start on January 22, 2020, when the total number of reported COVID-19 cases worldwide was below 600. The dataset was collected using the Twitter API and by rehydrating tweets from other available datasets; data collection is ongoing as of the time of writing. To facilitate hypothesis testing and exploration of social discourse, the English and Spanish tweets have been augmented with state-of-the-art Twitter Sentiment and Named Entity Recognition algorithms. The dataset and the summary files provided allow researchers to avoid some computationally intensive analyses, facilitating more widespread use of social media data to gain insights on issues such as (mis)information diffusion, semantic networks, sentiments, and the evolution of COVID-19 discussions. In addition, the dataset provides an archive for researchers in the social sciences who want access to a dataset covering the entire duration of the pandemic.
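
Rehydration, as mentioned in the abstract, means retrieving full tweet objects from shared tweet IDs. The sketch below illustrates only the ID-batching step that typically precedes such lookups. It is not the authors' pipeline: the function name `batch_ids`, the constant `BATCH_SIZE`, and the one-numeric-ID-per-string input format are hypothetical, and the limit of 100 IDs per lookup request is an assumption that should be checked against the API version in use. The network call itself is deliberately omitted.

```python
from typing import Iterable, Iterator, List

# Assumption: the tweet lookup endpoint accepts at most 100 IDs per request.
BATCH_SIZE = 100


def batch_ids(tweet_ids: Iterable[str], size: int = BATCH_SIZE) -> Iterator[List[str]]:
    """Yield lists of at most `size` unique, numeric tweet IDs, preserving order."""
    if size < 1:
        raise ValueError("size must be at least 1")
    seen = set()
    batch: List[str] = []
    for raw in tweet_ids:
        tid = raw.strip()
        # Skip blank lines, non-numeric entries, and IDs already queued.
        if not tid or not tid.isdigit() or tid in seen:
            continue
        seen.add(tid)
        batch.append(tid)
        if len(batch) == size:
            yield batch
            batch = []
    # Emit the final, possibly shorter, batch.
    if batch:
        yield batch
```

Each yielded batch could then be passed to whatever lookup client a researcher uses; tweets that have been deleted or made private since collection would simply be missing from the response.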
