Abstract

Technologies for abusive language detection are being developed and applied with little consideration of their potential biases. We examine racial bias in five different sets of Twitter data annotated for hate speech and abusive language. We train classifiers on these datasets and compare the predictions of these classifiers on tweets written in African-American English with those written in Standard American English. The results show evidence of systematic racial bias in all datasets, as classifiers trained on them tend to predict that tweets written in African-American English are abusive at substantially higher rates. If these abusive language detection systems are used in the field, they will therefore have a disproportionate negative impact on African-American social media users. Consequently, these systems may discriminate against the groups who are often the targets of the abuse we are trying to detect.

Highlights

  • Recent work has shown evidence of substantial bias in machine learning systems, which is typically a result of bias in the training data

  • Our study focuses on racial bias in hate speech and abusive language detection datasets (Waseem, 2016; Waseem and Hovy, 2016; Davidson et al., 2017; Golbeck et al., 2017; Founta et al., 2018), all of which use data collected from Twitter

  • We train classifiers using each of the datasets and use a corpus of tweets with demographic information to compare how each classifier performs on tweets written in African-American English (AAE) versus Standard American English (SAE) (Blodgett et al., 2016); a sketch of this comparison follows this list
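
A minimal sketch of this kind of audit, assuming a simple bag-of-words pipeline (TF-IDF features with logistic regression, a plausible stand-in rather than the authors' exact models) and hypothetical CSV files holding the annotated training tweets and the dialect-aligned AAE and SAE corpora:

```python
# Minimal sketch of the audit described above (not the authors' released code).
# A text classifier is trained on an annotated tweet dataset and then applied
# to two dialect-aligned corpora; the rate at which each corpus is flagged as
# abusive is compared. File names and column names are hypothetical.
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Hypothetical labeled data: one row per tweet, label 1 = abusive/hateful, 0 = not.
train = pd.read_csv("annotated_tweets.csv")      # columns: text, label
aae = pd.read_csv("aae_tweets.csv")["text"]      # tweets inferred to be AAE
sae = pd.read_csv("sae_tweets.csv")["text"]      # tweets inferred to be SAE

clf = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 2), min_df=5),  # unigram/bigram TF-IDF features
    LogisticRegression(max_iter=1000),              # regularized logistic regression
)
clf.fit(train["text"], train["label"])

# Proportion of each dialect corpus predicted to be abusive.
p_aae = clf.predict(aae).mean()
p_sae = clf.predict(sae).mean()
print(f"Predicted abusive rate: AAE={p_aae:.3f}  SAE={p_sae:.3f}  ratio={p_aae / p_sae:.2f}")
```

The quantity of interest is the ratio of predicted-abusive rates between the two dialect corpora; a ratio well above 1 is the kind of disparity the study reports.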

Introduction

Recent work has shown evidence of substantial bias in machine learning systems, which is typically a result of bias in the training data. Machine learning models are currently being deployed in the field to detect hate speech and abusive language on social media platforms including Facebook, Instagram, and YouTube. The aim of these models is to identify abusive language that directly targets others, particularly individuals or groups belonging to protected categories (Waseem et al., 2017). In most cases the magnitude of the bias decreases when we condition on particular keywords that may indicate membership in the negative (abusive) classes, yet it still persists. We expect that these biases will result in racial discrimination if classifiers trained on any of these datasets are deployed in the field.
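
The keyword-conditioning check mentioned above can be illustrated with a short, hypothetical extension of the earlier sketch: restrict both dialect corpora to tweets containing a given keyword and compare predicted-abusive rates within that subset. The function name and keyword strings below are illustrative placeholders, not taken from the paper:

```python
# Illustrative keyword-conditioning check (names are placeholders, not the
# paper's code). Reuses the fitted pipeline `clf` and the dialect corpora
# `aae` and `sae` from the earlier sketch.
def conditional_rates(clf, aae_texts, sae_texts, keyword):
    """Predicted-abusive rates for AAE and SAE tweets that contain `keyword`."""
    aae_sub = [t for t in aae_texts if keyword in t.lower()]
    sae_sub = [t for t in sae_texts if keyword in t.lower()]
    if not aae_sub or not sae_sub:
        return None  # keyword missing from one corpus; nothing to compare
    return clf.predict(aae_sub).mean(), clf.predict(sae_sub).mean()

# Placeholder keywords standing in for the specific terms the study conditions on.
for kw in ["keyword_a", "keyword_b"]:
    rates = conditional_rates(clf, list(aae), list(sae), kw)
    if rates:
        print(f"{kw}: AAE={rates[0]:.3f}  SAE={rates[1]:.3f}")
```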
