A Novel Stacked Ensemble for Hate Speech Recognition

Mona Khalifa A Aljero, Nazife Dimililer

doi:10.3390/app112411684

Abstract

Detecting harmful content or hate speech on social media is a significant challenge due to the high throughput and large volume of content production on these platforms. Identifying hate speech in a timely manner is crucial in preventing its dissemination. We propose a novel stacked ensemble approach for detecting hate speech in English tweets. The proposed architecture employs an ensemble of three classifiers, namely support vector machine (SVM), logistic regression (LR), and XGBoost classifier (XGB), trained using word2vec and universal encoding features. The meta classifier, LR, combines the outputs of the three base classifiers and the features employed by the base classifiers to produce the final output. It is shown that the proposed architecture improves the performance of the widely used single classifiers as well as the standard stacking and classifier ensemble using majority voting. We also present results on the use of various combinations of machine learning classifiers as base classifiers. The experimental results from the proposed architecture indicated an improvement in the performance on all four datasets compared with the standard stacking, base classifiers, and majority voting. Furthermore, on three of these datasets, the proposed architecture outperformed all state-of-the-art systems.
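The architecture described in the abstract, where the meta classifier sees both the base classifiers' outputs and the original features, can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: it assumes scikit-learn, uses synthetic features in place of the word2vec and universal encoding features, and substitutes scikit-learn's GradientBoostingClassifier for XGBoost so the example has no extra dependency. The `passthrough=True` option is what forwards the original features to the meta classifier alongside the base predictions.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Synthetic stand-in for tweet feature vectors (assumption: the paper's
# actual inputs are word2vec and universal encoding features).
X, y = make_classification(
    n_samples=400, n_features=20, n_informative=8, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Three base classifiers; gradient boosting is a stand-in for XGBoost.
base_classifiers = [
    ("svm", SVC(probability=True, random_state=0)),
    ("lr", LogisticRegression(max_iter=1000)),
    ("gb", GradientBoostingClassifier(n_estimators=50, random_state=0)),
]

# LR meta classifier; passthrough=True appends the original features
# to the base classifiers' outputs before the meta classifier is fit.
stack = StackingClassifier(
    estimators=base_classifiers,
    final_estimator=LogisticRegression(max_iter=1000),
    passthrough=True,
    cv=5,
)
stack.fit(X_train, y_train)

# Inputs seen by the meta classifier: base outputs plus original features.
meta_inputs = stack.transform(X_test)
accuracy = stack.score(X_test, y_test)
print(meta_inputs.shape, round(accuracy, 3))
```

Setting `passthrough=False` in the same sketch gives standard stacking, in which the meta classifier sees only the base classifiers' outputs, which is the baseline the abstract compares against.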

Highlights

An undesirable side effect of the increase in social media usage has been the rapid growth of hate speech on these platforms
In the (logistic regression (LR), Naïve Bayes (NB), Random Forest (RF)) combination, RF in most cases agrees with LR, NB, or both on wrong class predictions, which leads to misclassification under the majority voting approach
On the Davidson, COVID-HATE, and ZeerakW datasets, the best results were obtained by the (SVM, LR, XGBoost classifier (XGB)), (LR, NB, RF), and (KNN, LR, NB) combinations, respectively
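The majority voting failure mode noted in the highlights can be illustrated with a small toy example. The predictions below are made up purely for illustration and do not come from the paper; `majority_vote` is a hypothetical helper name. Whenever RF sides with one wrong base classifier, the two wrong votes outnumber the single correct one.

```python
from collections import Counter


def majority_vote(predictions):
    """Return the most common label among the base-classifier votes."""
    return Counter(predictions).most_common(1)[0][0]


# Toy, made-up labels for four tweets (1 = hate, 0 = not hate).
true_labels = [1, 0, 1, 0]
lr_preds = [1, 0, 0, 0]  # LR is wrong on tweet 2
nb_preds = [1, 1, 1, 0]  # NB is wrong on tweet 1
rf_preds = [1, 1, 0, 0]  # RF repeats NB's error on tweet 1 and LR's on tweet 2

# Hard majority vote across the three classifiers, tweet by tweet.
voted = [majority_vote(votes) for votes in zip(lr_preds, nb_preds, rf_preds)]

# Indices where the ensemble's vote disagrees with the true label.
errors = [i for i, (v, t) in enumerate(zip(voted, true_labels)) if v != t]
print(voted, errors)
```

In this toy case each of LR and NB is wrong only once, yet the ensemble is wrong twice, because RF's errors coincide with theirs rather than being independent.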

Summary

Introduction

An undesirable side effect of the increase in social media usage has been the rapid growth of hate speech on these platforms. Hate speech can be defined as an attack on a specific person or group based on race, ethnicity, religion, gender, age, disability, or sexual orientation. Each social media platform has its own definition of hate speech, but all agree that hate speech attacks specific target groups based on some discriminating characteristic. Even though recent research on automatic detection of hate speech is well presented in [1,2,3,4,5], to the best of our knowledge, no such system has been fully implemented yet. Although Facebook implemented a model called RoBERTa in 2019 to detect toxic posts, dependence on user reports for the detection of hate speech has not yet been eliminated.



Journal: Applied Sciences
Publication Date: Dec 9, 2021
Citations: 5
License type: CC BY 4.0
