Evaluating Machine Learning Techniques for Detecting Offensive and Hate Speech in South African Tweets

Oluwafemi Oriola, Eduan Kotze

doi:10.1109/access.2020.2968173

Abstract

In recent times, South Africa has witnessed a surge of offensive and hate speech along racial and ethnic lines on Twitter. English is popular among the South African languages used. Although machine learning has been successfully used to detect offensive and hate speech in several English contexts, the distinctiveness of South African tweets and the similarities among offensive, hate and free speech require a domain-specific English corpus and techniques to detect offensive and hate speech. Thus, we developed an English corpus from South African tweets and evaluated different machine learning techniques to detect offensive and hate speech. Character n-gram, word n-gram, negative sentiment and syntactic-based features, as well as their hybrid, were extracted and analyzed using hyper-parameter optimization, ensemble and multi-tier meta-learning models of support vector machine, logistic regression, random forest and gradient boosting algorithms. The results showed that the optimized support vector machine with character n-grams performed best in detecting hate speech, with a true positive rate of 0.894, while optimized gradient boosting with word n-grams performed best in detecting offensive speech, with a true positive rate of 0.867. However, their performance in detecting the other threatening classes was poor. Multi-tier meta-learning models achieved the most consistent and balanced classification performance, with true positive rates of 0.858 and 0.887 for hate speech and offensive speech, respectively, as well as a true positive rate of 0.646 for free speech and an overall accuracy of 0.671. The error analysis showed that the multi-tier meta-learning model could reduce the misclassification error rate of the optimized models by 34.26%.
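To make the feature types named in the abstract concrete, the following is a minimal sketch of character and word n-gram extraction in plain Python. The function names, the whitespace tokenizer, the lowercasing step and the `c:`/`w:` prefixes are illustrative assumptions, not the authors' actual preprocessing, which is not described in this section.

```python
from collections import Counter


def char_ngrams(text: str, n: int) -> list[str]:
    """Return all overlapping character n-grams of a lowercased string."""
    text = text.lower()
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def word_ngrams(text: str, n: int) -> list[str]:
    """Return all overlapping word n-grams, splitting on whitespace."""
    tokens = text.lower().split()
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def hybrid_counts(text: str, char_n: int = 3, word_n: int = 2) -> Counter:
    """Count character and word n-grams in one shared feature space.

    The "c:" and "w:" prefixes (an assumption for this sketch) keep the
    two feature families from colliding.
    """
    counts = Counter(f"c:{g}" for g in char_ngrams(text, char_n))
    counts.update(f"w:{g}" for g in word_ngrams(text, word_n))
    return counts
```

Character n-grams tend to be robust to the creative spellings common in tweets, whereas word n-grams capture short phrases; a hybrid simply concatenates both families into a single feature vector.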

Highlights

Social networks are among the most impactful innovations in the 21st century
The stacking-based metaclassifier with Support Vector Machine had the best performance, with an accuracy of 79.8 and an F1-score of 0.45 for the minority hate speech class, compared to other ensemble classifiers such as plurality voting and mean probability voting
We evaluated different hyper-parameter configurations of machine learning classifiers such as Logistic Regression (LogReg), Support Vector Machine (SVM), Random Forest (RF) and Gradient Boosting (GB) for optimal performance
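The hyper-parameter search and stacking ideas mentioned in the highlights can be sketched with scikit-learn as follows. This is only an illustration under stated assumptions: the toy tweets, the label coding, the parameter grid and the choice of base learners are hypothetical, and a generic stacking ensemble stands in for the authors' multi-tier meta-learning architecture, which is not specified in this section.

```python
from sklearn.ensemble import (
    GradientBoostingClassifier,
    RandomForestClassifier,
    StackingClassifier,
)
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline, make_pipeline
from sklearn.svm import SVC

# Synthetic placeholder tweets (NOT the paper's corpus).
# Hypothetical label coding: 0 = free, 1 = offensive, 2 = hate.
texts = [
    "lovely sunny day in durban", "great match last night",
    "happy heritage day everyone", "traffic is slow this morning",
    "you are a complete fool", "what a useless clown you are",
    "shut up you silly fool", "you clown stop talking",
    "i despise everyone from group_x", "group_x people do not belong here",
    "group_x should all be chased away", "we must get rid of group_x",
]
labels = [0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2]

# 1) Grid search over an SVM on character n-gram TF-IDF features.
svm_pipeline = Pipeline([
    ("tfidf", TfidfVectorizer(analyzer="char", ngram_range=(2, 4))),
    ("svm", SVC(kernel="linear")),
])
param_grid = {  # assumed grid, chosen only for illustration
    "tfidf__ngram_range": [(2, 3), (2, 4)],
    "svm__C": [0.1, 1.0, 10.0],
}
search = GridSearchCV(svm_pipeline, param_grid, cv=2, scoring="accuracy")
search.fit(texts, labels)

# 2) Stacking: base learners feed out-of-fold predictions to an SVM meta-learner.
base_learners = [
    ("logreg", LogisticRegression(max_iter=1000)),
    ("rf", RandomForestClassifier(n_estimators=50, random_state=0)),
    ("gb", GradientBoostingClassifier(n_estimators=20, random_state=0)),
]
stack = make_pipeline(
    TfidfVectorizer(analyzer="char", ngram_range=(2, 4)),
    StackingClassifier(base_learners, final_estimator=SVC(kernel="linear"), cv=2),
)
stack.fit(texts, labels)
predictions = stack.predict(texts)
```

After fitting, `search.best_params_` holds the selected configuration. With a real corpus, a larger `cv` value and a held-out test set would be needed before drawing any conclusions.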

Summary

INTRODUCTION

A popular social networking platform is Twitter, which allows subscribers to propagate information in cyberspace using alphanumeric and special characters, hyperlinks, images, emoticons and other icons. Machine learning has been used to classify and detect offensive and hate speech on Twitter in contexts such as racism, sexism, misogyny, religion, refugees and immigrants. These studies have involved binary classification [3], multiclass classification [4], [5] or both [6]. Offensive speech is defined as any fair or unfair expression that is not hate speech but is discriminatory against a person or group of persons, while free speech is any expression that justifies the right to freedom of expression.

DISTINCTIVENESS OF SOUTH AFRICAN TWEETS

METHODOLOGY

TOKENIZATION AND DATA PREPROCESSING

PERFORMANCE METRICS
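
The abstract reports results as per-class true positive rates and overall accuracy. As a minimal sketch, assuming the standard definitions (true positive rate of a class = correctly predicted instances of that class divided by all actual instances of that class), these could be computed as below. The function names and the example labels are illustrative and do not come from the paper.

```python
from collections import defaultdict


def per_class_tpr(y_true, y_pred):
    """True positive rate (recall) per class: TP / (TP + FN)."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for truth, pred in zip(y_true, y_pred):
        totals[truth] += 1
        if truth == pred:
            hits[truth] += 1
    return {label: hits[label] / totals[label] for label in totals}


def accuracy(y_true, y_pred):
    """Fraction of all instances whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Made-up labels for illustration only.
y_true = ["hate", "hate", "offensive", "free", "free", "free"]
y_pred = ["hate", "free", "offensive", "free", "free", "hate"]
tpr = per_class_tpr(y_true, y_pred)
acc = accuracy(y_true, y_pred)
```

Reporting the rate per class matters with imbalanced data: a classifier can reach a high overall accuracy while still missing most instances of a minority class such as hate speech.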

CONCLUSION