Abstract

Detecting harmful content or hate speech on social media is a significant challenge because of the high throughput and large volume of content produced on these platforms. Identifying hate speech in a timely manner is crucial to preventing its dissemination. We propose a novel stacked ensemble approach for detecting hate speech in English tweets. The proposed architecture employs an ensemble of three base classifiers, namely support vector machine (SVM), logistic regression (LR), and XGBoost (XGB), trained using word2vec and universal encoding features. The meta classifier, an LR model, combines the outputs of the three base classifiers with the features employed by those classifiers to produce the final output. We show that the proposed architecture improves on the performance of widely used single classifiers, standard stacking, and a classifier ensemble using majority voting. We also present results for various combinations of machine learning classifiers as base classifiers. On all four datasets, the proposed architecture outperformed standard stacking, the base classifiers, and majority voting. Furthermore, on three of these datasets, it outperformed all state-of-the-art systems.
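The difference between standard stacking and the architecture described in the abstract lies in what the meta classifier receives as input: in the proposed design it sees the base classifiers' outputs together with the original features. The sketch below illustrates only that input construction. The function name `build_meta_features`, the variable names, and all numeric values are hypothetical and chosen for illustration; they are not taken from the paper.

```python
from typing import List, Sequence


def build_meta_features(
    base_outputs: Sequence[Sequence[float]],
    features: Sequence[Sequence[float]],
) -> List[List[float]]:
    """Concatenate base-classifier outputs with the original features.

    base_outputs[k][i] is the output (for example, a hate-class score) of
    base classifier k for sample i; features[i] is the feature vector of
    sample i. Standard stacking would pass only the base outputs on.
    """
    n_samples = len(features)
    for outputs in base_outputs:
        if len(outputs) != n_samples:
            raise ValueError("every base classifier needs one output per sample")
    return [
        [outputs[i] for outputs in base_outputs] + list(features[i])
        for i in range(n_samples)
    ]


# Toy values, invented for illustration only (not results from the paper).
svm_scores = [0.9, 0.1]
lr_scores = [0.8, 0.3]
xgb_scores = [0.7, 0.2]
tweet_features = [[1.0, 2.0], [3.0, 4.0]]  # stand-in for embedding features

meta_inputs = build_meta_features(
    [svm_scores, lr_scores, xgb_scores], tweet_features
)
```

Each row of `meta_inputs` holds three base outputs followed by the sample's own features, and a meta classifier such as LR would then be trained on these rows. In scikit-learn, a comparable effect can be obtained with `StackingClassifier(..., passthrough=True)`; whether the authors used that class is not stated here.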

Highlights

  • An undesirable side effect of the increase in social media usage has been the rapid growth of hate speech on these platforms

  • In the (logistic regression (LR), Naïve Bayes (NB), Random Forest (RF)) combination, RF in most cases agrees with LR, NB, or both on wrong class predictions, which leads to misclassification under the majority voting approach

  • On the Davidson, COVID-HATE, and ZeerakW datasets, the best results were obtained by the (SVM, LR, XGB), (LR, NB, RF), and (k-nearest neighbors (KNN), LR, NB) combinations, respectively
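The majority-voting failure mode noted in the highlights (one classifier siding with another on a wrong prediction) can be illustrated with a small sketch. The helper `majority_vote` and the label lists are hypothetical toy values, not predictions from the paper's experiments.

```python
from collections import Counter
from typing import List, Sequence


def majority_vote(predictions: Sequence[Sequence[int]]) -> List[int]:
    """Return the most common label per sample across classifiers.

    predictions[k][i] is the label predicted by classifier k for sample i.
    With three classifiers and two classes there are no ties; for other
    setups, ties go to the label seen first among the classifiers.
    """
    n_samples = len(predictions[0])
    voted = []
    for i in range(n_samples):
        votes = Counter(p[i] for p in predictions)
        voted.append(votes.most_common(1)[0][0])
    return voted


# Toy labels, invented for illustration (1 = hate, 0 = not hate).
true_labels = [1, 0, 1]
lr_pred = [1, 0, 0]  # wrong on the third sample
nb_pred = [1, 1, 1]  # wrong on the second sample
rf_pred = [1, 1, 0]  # agrees with NB, then with LR, on their errors

ensemble_pred = majority_vote([lr_pred, nb_pred, rf_pred])
```

In this toy case every sample has at least one correct base prediction, yet the vote is wrong on the second and third samples because RF sides with the mistaken classifier each time. A trained meta classifier is not bound to follow the majority in this way.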


Summary

Introduction

An undesirable side effect of the increase in social media usage has been the rapid growth of hate speech on these platforms. Hate speech can be defined as an attack on a specific person or group based on race, ethnicity, religion, gender, age, disability, or sexual orientation. Each social media platform has its own definition of hate speech, but all agree that hate speech attacks specific target groups based on some discriminating characteristic. Even though recent research on the automatic detection of hate speech is well presented in [1,2,3,4,5], to the best of our knowledge, no such system has been fully implemented yet. Although Facebook implemented a model called RoBERTa in 2019 to detect toxic posts, dependence on user reports for the detection of hate speech has not yet been eliminated.
