Abstract

In March 2020, the World Health Organization declared the COVID-19 outbreak a pandemic. Soon afterwards, people began sharing millions of posts on social media without considering their reliability or truthfulness. While there has been extensive research on COVID-19 in the English language, research on the subject in Arabic is lacking. In this paper, we address the problem of detecting fake news surrounding COVID-19 in Arabic tweets. We collected more than seven million Arabic tweets related to the coronavirus pandemic, posted from January 2020 to August 2020, using the hashtags that were trending during the pandemic. We relied on two fact-checkers, the France-Press Agency and the Saudi Anti-Rumors Authority, to extract a list of keywords related to misinformation and fake news topics. A small corpus was extracted from the collected tweets and manually annotated as fake or genuine. We used a set of features extracted from tweet contents to train a set of machine learning classifiers. The manually annotated corpus was used as a baseline to build a system for automatically detecting fake news in Arabic text. Classification of the manually annotated dataset achieved an F1-score of 87.8% using Logistic Regression (LR) with n-gram-level Term Frequency-Inverse Document Frequency (TF-IDF) features, and the same classifier with count vector features achieved an F1-score of 93.3% on the automatically annotated dataset. The introduced system and datasets could help governments, decision-makers, and the public judge the credibility of information published on social media during the COVID-19 pandemic.
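To make the classification setup concrete, the following is a minimal from-scratch sketch of n-gram-level TF-IDF features feeding a logistic regression classifier, evaluated with the F1-score. It is an illustration under stated assumptions, not the authors' implementation: the n-gram range, whitespace tokenization, smoothed IDF variant, learning rate, and epoch count are all assumed, and the four English example texts are invented placeholders standing in for annotated Arabic tweets. Replacing the TF-IDF weights with raw n-gram counts would give the count vector variant mentioned above.

```python
import math
from collections import Counter

def word_ngrams(text, n_max=2):
    """Return all word n-grams of length 1..n_max (whitespace tokenization)."""
    tokens = text.lower().split()
    return [" ".join(tokens[i:i + n])
            for n in range(1, n_max + 1)
            for i in range(len(tokens) - n + 1)]

def fit_tfidf(docs, n_max=2):
    """Learn a vocabulary and smoothed IDF weights from the training texts."""
    doc_freq = Counter()
    for doc in docs:
        doc_freq.update(set(word_ngrams(doc, n_max)))
    vocab = {term: idx for idx, term in enumerate(sorted(doc_freq))}
    n_docs = len(docs)
    # Assumed IDF variant: log((1 + N) / (1 + df)) + 1
    idf = {t: math.log((1 + n_docs) / (1 + df)) + 1 for t, df in doc_freq.items()}
    return vocab, idf

def transform(doc, vocab, idf, n_max=2):
    """Map a text to an L2-normalized sparse TF-IDF vector {feature index: weight}."""
    counts = Counter(t for t in word_ngrams(doc, n_max) if t in vocab)
    vec = {vocab[t]: c * idf[t] for t, c in counts.items()}
    norm = math.sqrt(sum(w * w for w in vec.values()))
    return {i: w / norm for i, w in vec.items()} if norm else vec

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_logreg(vectors, labels, n_features, lr=0.5, epochs=200):
    """Fit logistic regression with plain stochastic gradient descent."""
    weights, bias = [0.0] * n_features, 0.0
    for _ in range(epochs):
        for vec, y in zip(vectors, labels):
            p = sigmoid(bias + sum(weights[i] * w for i, w in vec.items()))
            err = p - y  # gradient of the log loss with respect to the logit
            for i, w in vec.items():
                weights[i] -= lr * err * w
            bias -= lr * err
    return weights, bias

def predict(vec, weights, bias):
    """Return 1 (fake) if the predicted probability is at least 0.5, else 0 (genuine)."""
    return int(sigmoid(bias + sum(weights[i] * w for i, w in vec.items())) >= 0.5)

def f1_score(y_true, y_pred):
    """F1 for the positive (fake) class: harmonic mean of precision and recall."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision, recall = tp / (tp + fp), tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Hypothetical toy data (1 = fake, 0 = genuine); not taken from the paper's corpus.
train_texts = [
    "garlic cures the virus",
    "drinking hot water kills the virus",
    "ministry reports new confirmed cases",
    "health ministry announces vaccine trial",
]
train_labels = [1, 1, 0, 0]

vocab, idf = fit_tfidf(train_texts)
train_vectors = [transform(t, vocab, idf) for t in train_texts]
weights, bias = train_logreg(train_vectors, train_labels, len(vocab))
train_preds = [predict(v, weights, bias) for v in train_vectors]
print("training F1:", f1_score(train_labels, train_preds))
```

A score computed on the training texts, as here, only confirms that the pipeline runs end to end; a meaningful F1-score requires held-out annotated tweets. In practice, a library such as scikit-learn would typically replace the hand-written feature extraction and optimizer.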

Highlights

  • The rise of social networks such as Facebook, Twitter, and many others has enabled the rapid spread of information

  • We address the problem of fake news detection on Twitter during the COVID-19 pandemic period

  • The proposed model is capable of detecting rumors based on a tweet’s text, and experiments showed that the proposed model outperforms state-of-the-art classifiers



Introduction

The rise of social networks such as Facebook, Twitter, and many others has enabled the rapid spread of information. Twitter is one of the most popular social media platforms. It is designed to let users share information as short texts, known as tweets, of no more than 280 characters, and each user can follow as many accounts as he or she wants. The spread of misinformation about COVID-19 symptoms may harm people [1]. It could be anxiety-inducing for a person who experiences COVID-19-like symptoms even if they have not been infected with the virus. The terms fake news and misinformation are closely related and are often

