Detection of Hate Speech in COVID-19-Related Tweets in the Arab Region: Deep Learning and Topic Modeling Approach.

Raghad Alshalan, Shahad Alshalan, Hend Al-Khalifa, Duaa Alsaeed, Heyam Al-Baity

doi:10.2196/22609

Abstract

Background: The massive scale of social media platforms requires an automatic solution for detecting hate speech. Such automatic solutions help reduce the need for manual analysis of content. Most previous literature has cast the hate speech detection problem as a supervised text classification task using classical machine learning methods or, more recently, deep learning methods. However, work investigating this problem in Arabic cyberspace is still limited compared to the published work on English text.

Objective: This study aims to identify hate speech related to the COVID-19 pandemic posted by Twitter users in the Arab region and to discover the main issues discussed in tweets containing hate speech.

Methods: We used the ArCOV-19 dataset, an ongoing collection of Arabic tweets related to COVID-19, starting from January 27, 2020. Tweets were analyzed for hate speech using a pretrained convolutional neural network (CNN) model; each tweet was given a score between 0 and 1, with 1 being the most hateful text. We also used nonnegative matrix factorization to discover the main issues and topics discussed in hate tweets.

Results: The analysis of hate speech in Twitter data in the Arab region identified that the number of non–hate tweets greatly exceeded the number of hate tweets, where the percentage of hate tweets among COVID-19 related tweets was 3.2% (11,743/547,554). The analysis also revealed that the majority of hate tweets (8385/11,743, 71.4%) contained a low level of hate based on the score provided by the CNN. This study identified Saudi Arabia as the Arab country from which the most COVID-19 hate tweets originated during the pandemic. Furthermore, we showed that the largest number of hate tweets appeared during the time period of March 1-30, 2020, representing 51.9% of all hate tweets (6095/11,743). Contrary to what was anticipated, the spread of COVID-19–related hate speech on Twitter in the Arab region was only weakly correlated with the spread of the pandemic, based on the Pearson correlation coefficient (r=0.1982, P=.50). The study also identified the commonly discussed topics in hate tweets during the pandemic. Analysis of the 7 extracted topics showed that 6 of the 7 identified topics were related to hate speech against China and Iran. Arab users also discussed topics related to political conflicts in the Arab region during the COVID-19 pandemic.

Conclusions: The COVID-19 pandemic poses serious public health challenges to nations worldwide. During the COVID-19 pandemic, frequent use of social media can contribute to the spread of hate speech. Hate speech on the web can have a negative impact on society, and hate speech may have a direct correlation with real hate crimes, which increases the threat associated with being targeted by hate speech and abusive language. This study is the first to analyze hate speech in the context of Arabic COVID-19–related tweets in the Arab region.
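To make the topic modeling step in the Methods more concrete, the following is a minimal sketch of nonnegative matrix factorization on a tiny document-term matrix. It is not the authors' pipeline: the abstract does not state which solver, term weighting, or library was used, so the multiplicative-update rule, the function names (`nmf`, `top_terms`, `frobenius_error`), and the toy vocabulary and counts below are all assumptions made for illustration.

```python
import random

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]

def transpose(m):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*m)]

def frobenius_error(v, w, h):
    """Frobenius norm of V - WH; lower means a closer reconstruction."""
    wh = matmul(w, h)
    return sum((x - y) ** 2 for rv, rw in zip(v, wh) for x, y in zip(rv, rw)) ** 0.5

def nmf(v, n_topics, n_iter=200, seed=0, eps=1e-9):
    """Factor V (documents x terms) into W (documents x topics) and
    H (topics x terms) with multiplicative updates. The solver choice is
    an assumption; the source does not say how the factorization was run."""
    rng = random.Random(seed)
    n_docs, n_terms = len(v), len(v[0])
    # Strictly positive random start so no entry is stuck at zero.
    w = [[rng.random() + 0.1 for _ in range(n_topics)] for _ in range(n_docs)]
    h = [[rng.random() + 0.1 for _ in range(n_terms)] for _ in range(n_topics)]
    for _ in range(n_iter):
        # H <- H * (W^T V) / (W^T W H), element-wise
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[k][j] * num[k][j] / (den[k][j] + eps) for j in range(n_terms)]
             for k in range(n_topics)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][k] * num[i][k] / (den[i][k] + eps) for k in range(n_topics)]
             for i in range(n_docs)]
    return w, h

def top_terms(h, vocab, n=3):
    """For each topic (row of H), list the n terms with the largest weight."""
    return [[vocab[j] for j in sorted(range(len(vocab)), key=lambda j: -row[j])[:n]]
            for row in h]

# Hypothetical toy data: 6 short documents over 6 terms (raw counts).
VOCAB = ["virus", "travel", "border", "prices", "markets", "jobs"]
V = [
    [2, 1, 1, 0, 0, 0],
    [1, 2, 1, 0, 0, 0],
    [1, 1, 2, 0, 0, 0],
    [0, 0, 0, 2, 1, 1],
    [0, 0, 0, 1, 2, 1],
    [0, 0, 0, 1, 1, 2],
]

if __name__ == "__main__":
    w, h = nmf(V, n_topics=2)
    for k, terms in enumerate(top_terms(h, VOCAB)):
        print(f"topic {k}: {terms}")
    print("reconstruction error:", round(frobenius_error(V, w, h), 3))
```

Each row of H can be read as a topic (a weighting over terms) and each row of W as a document's mix of topics. In practice a tested library implementation and a weighted matrix built from the real tweets would replace this toy setup.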

Highlights

Social media platforms such as Twitter provide valuable venues for information sharing, communication, and knowledge production
The analysis revealed that the majority of hate tweets (8385/11,743, 71.4%) contained a low level of hate based on the score provided by the convolutional neural network (CNN)
This study identified Saudi Arabia as the Arab country from which the most COVID-19 hate tweets originated during the pandemic
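As a rough illustration of how a per-tweet score between 0 and 1 could be turned into hate levels and shares such as the one quoted above, here is a small sketch. The cut-offs in `LEVELS` and the names `hate_level` and `level_shares` are hypothetical; the page does not state the thresholds the study used.

```python
from collections import Counter

# Hypothetical cut-offs chosen only for illustration; NOT taken from the study.
# Each pair is (minimum score, label), checked from highest to lowest.
LEVELS = [(0.9, "high"), (0.7, "medium"), (0.5, "low")]

def hate_level(score):
    """Map a classifier score in [0, 1] to a level label."""
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must be between 0 and 1")
    for threshold, label in LEVELS:
        if score >= threshold:
            return label
    return "non-hate"

def level_shares(scores):
    """Share of each hate level among the tweets labelled as hate."""
    labels = [hate_level(s) for s in scores]
    hateful = [label for label in labels if label != "non-hate"]
    if not hateful:
        return {}
    counts = Counter(hateful)
    return {level: counts[level] / len(hateful) for level in ("low", "medium", "high")}

if __name__ == "__main__":
    # Made-up scores, not data from the study.
    print(level_shares([0.10, 0.55, 0.60, 0.75, 0.95]))
```

With different cut-offs the shares change, so a reported figure like "71.4% low level" is only meaningful alongside the thresholds that define the levels.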

Summary

Introduction

Social media platforms such as Twitter provide valuable venues for information sharing, communication, and knowledge production. While there is no formal definition of hate speech, scholars and service providers generally agree in defining it as any language that attacks a person or a group based on a characteristic such as race, color, ethnicity, gender, sexual orientation, nationality, or religion [3]. This type of discriminatory and hateful speech can have a destructive impact on society, as it threatens the culture of coexistence and unity. Work investigating this problem in Arabic cyberspace is still limited compared to the published work on English text.

Methods

Results

Discussion

Conclusion

Journal: Journal of Medical Internet Research
Publication Date: Dec 8, 2020
Citations: 45
License type: cc-by

Similar Papers

A Comparative Study of Deep Learning Methods for Hate Speech and Offensive Language Detection in Textual Data
Yogesh Yadav ... Parth Bajaj
19 Dec 2021

HateGAN: Adversarial Generative-Based Data Augmentation for Hate Speech Detection
Rui Cao ... Roy Ka-Wei Lee
01 Jan 2020

Sinhala Hate Speech Detection in Social Media using Text Mining and Machine learning
H.M.S.T Sandaruwan ... S.A.S Lorensuhewa
01 Sep 2019

Evaluating Machine Learning Techniques for Detecting Offensive and Hate Speech in South African Tweets
Oluwafemi Oriola ... Eduan Kotze
IEEE Access | VOL. 8
01 Jan 2020
