Arabic Offensive and Hate Speech Detection Using a Cross-Corpora Multi-Task Learning Model

Wassen Aldjanabi, Ahmed Mohamed Helmi, Mohamed Abd Elaziz, Abdelghani Dahou, Mohammed A. A. Al-Qaness, Robertas Damaševičius

doi:10.3390/informatics8040069

Abstract

As social media platforms offer a medium for opinion expression, social phenomena such as hatred, offensive language, racism, and other forms of verbal violence have increased sharply. These behaviors are not confined to specific countries, groups, or communities; they extend into people’s everyday lives. This study investigates offensive and hate speech on Arabic social media in order to build an accurate detection system. More precisely, we develop a classification system for offensive and hate speech using a multi-task learning (MTL) model built on top of a pre-trained Arabic language model. We train the MTL model on the same task across several corpora that represent variation in offensive and hate contexts, so that it learns both global and dataset-specific contextual representations. The developed MTL model outperformed existing models in the literature on three out of four datasets for Arabic offensive and hate speech detection.

Highlights

In recent years, the use of social networks has substantially increased in the Arab world.
MTL-A-L and MTL-A-T are multi-task learning (MTL) models that use AraBERT in the shared part; OSACT-OFF (Open-Source Arabic Corpora and Corpora Processing Tools, offensive language detection), OSACT-HS (hate speech), and T-HSAB are used in the task-specific part.
MTL-M-L and MTL-M-T are MTL models that use MarBERT, covering Maghreb region dialect and Modern Standard Arabic (MSA), in the shared part; OSACT-OFF, OSACT-HS, and T-HSAB are used in the task-specific part.
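
The split between a shared part and a task-specific part can be illustrated with a minimal sketch. It is not the authors' implementation: the shared encoder here is a toy hashed-trigram embedding standing in for AraBERT or MarBERT, and the class names (`SharedEncoder`, `TaskHead`, `MultiTaskModel`), the dimensions, the learning rate, and the English placeholder sentences are all assumptions made for illustration. Only the three task names come from the highlights above.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


class SharedEncoder:
    """Toy stand-in for the shared encoder (AraBERT or MarBERT in the paper)."""

    def __init__(self, dim=8, buckets=256, seed=0):
        rng = random.Random(seed)
        self.buckets = buckets
        self.table = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(buckets)]

    def ids(self, text):
        # Hash character trigrams into buckets (assumes non-empty text).
        padded = f" {text} "
        grams = [padded[i:i + 3] for i in range(len(padded) - 2)]
        return [sum(ord(c) * 31 ** k for k, c in enumerate(g)) % self.buckets for g in grams]

    def encode(self, text):
        # Shared representation: the mean of the trigram embeddings.
        ids = self.ids(text)
        rows = [self.table[i] for i in ids]
        return [sum(col) / len(rows) for col in zip(*rows)], ids


class TaskHead:
    """Task-specific binary classifier; one instance per corpus."""

    def __init__(self, dim, seed):
        rng = random.Random(seed)
        self.w = [rng.uniform(-0.5, 0.5) for _ in range(dim)]
        self.b = 0.0

    def prob(self, h):
        return sigmoid(sum(wi * hi for wi, hi in zip(self.w, h)) + self.b)


class MultiTaskModel:
    """One shared encoder plus one head per task (hard parameter sharing)."""

    def __init__(self, tasks, dim=8):
        self.encoder = SharedEncoder(dim)
        self.heads = {t: TaskHead(dim, seed=i + 1) for i, t in enumerate(tasks)}

    def predict(self, text, task):
        h, _ = self.encoder.encode(text)
        return self.heads[task].prob(h)

    def train_step(self, text, label, task, lr=0.5):
        h, ids = self.encoder.encode(text)
        head = self.heads[task]
        grad = head.prob(h) - label  # gradient of cross-entropy w.r.t. the logit
        old_w = list(head.w)
        # Update only the head that owns this example.
        head.w = [wi - lr * grad * hi for wi, hi in zip(head.w, h)]
        head.b -= lr * grad
        # Update the shared embeddings, which every task contributes to.
        for i in ids:
            row = self.encoder.table[i]
            for k in range(len(row)):
                row[k] -= lr * grad * old_w[k] / len(ids)


# Invented placeholder examples; the real corpora contain Arabic tweets.
toy_corpora = {
    "OSACT-OFF": [("you are awful", 1), ("good morning", 0)],
    "OSACT-HS": [("that group is awful", 1), ("nice weather today", 0)],
    "T-HSAB": [("awful people", 1), ("thank you friend", 0)],
}

model = MultiTaskModel(list(toy_corpora))
for epoch in range(50):
    # Alternate between corpora so the shared part sees every dataset.
    for task, examples in toy_corpora.items():
        for text, label in examples:
            model.train_step(text, label, task)

for task in toy_corpora:
    print(task, round(model.predict("awful", task), 3))
```

Each training example updates two things: the head of the corpus it came from and the shared embeddings. The other heads are left untouched, which is one simple way a model can pick up representations common to all corpora while keeping dataset-specific decision boundaries.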

Summary

Introduction

The use of social networks has substantially increased in the Arab world. It has allowed more freedom for opinion expression, especially in the political domain. Because of the freedom of speech given to social media users, it has become relatively easy to propagate abusive or hate speech towards individuals, groups, or societies. Online hate speech is characterized as the use of offensive language aimed at a specific group of people who share some common trait [1], and social networks have been recognized as a favorable medium for planning and executing hate-attack-related activities [2]. It is important to detect such cases of cyber-aggression and cyber-bullying in good time [4].

Journal: Informatics
Publication Date: Oct 8, 2021
Citations: 40
License type: CC BY 4.0
