Abstract

Automatic detection of abusive online content, such as hate speech, offensive language, and threats, has become prevalent in social media, with multiple efforts dedicated to detecting this phenomenon in English. However, detecting hatred and abuse in low-resource languages remains a non-trivial challenge: the lack of sufficient labeled data and the inconsistent generalization of transformer-based multilingual pre-trained language models across typologically diverse languages make these models ineffective in some cases. We propose a meta-learning-based approach to the problem of few-shot hate speech and offensive language detection in low-resource languages, which allows hateful or offensive content to be predicted after observing only a few labeled examples in a specific target language. We investigate the feasibility of applying meta-learning to cross-lingual few-shot hate speech detection by leveraging two meta-learning models, one optimization-based and one metric-based (MAML and Proto-MAML, respectively). To the best of our knowledge, this is the first effort of this kind. To evaluate our approach, we treat hate speech and offensive language detection as two separate tasks and assemble two diverse collections of publicly available datasets: 15 datasets across 8 languages for hate speech and 6 datasets across 6 languages for offensive language. Our experiments show that meta-learning-based models outperform transfer learning-based models in the majority of cases, and that Proto-MAML is the best-performing model, as it can quickly generalize and adapt to new languages with only a few labeled data points (generally, 16 samples per class yields effective performance) to identify hateful or offensive content.
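The few-shot episode structure described above (adapt on a small support set per language, evaluate on a query set, and update a shared initialization) can be sketched with a first-order MAML loop on a toy linear classifier. This is an illustrative sketch only: the synthetic "language" tasks, the 16-shot support/query split, the first-order gradient approximation, and the `proto_init` function (a rough analogue of Proto-MAML's prototype-based head initialization) are assumptions for demonstration, not the paper's actual models or data.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 8  # toy feature dimension (a real model would use sentence embeddings)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-np.clip(z, -30, 30)))

def loss(w, X, y):
    # binary cross-entropy of a linear (logistic) classifier
    p = sigmoid(X @ w)
    return -np.mean(y * np.log(p + 1e-9) + (1 - y) * np.log(1 - p + 1e-9))

def grad(w, X, y):
    # gradient of the loss above with respect to w
    return X.T @ (sigmoid(X @ w) - y) / len(y)

def sample_task(rng, n_support=16, n_query=16):
    # toy stand-in for one language's labeled data: tasks share a common
    # decision boundary plus task-specific noise, split into support/query
    w_task = W_BASE + 0.3 * rng.normal(size=d)
    X = rng.normal(size=(n_support + n_query, d))
    y = (X @ w_task > 0).astype(float)
    return (X[:n_support], y[:n_support]), (X[n_support:], y[n_support:])

def adapt(w, X, y, inner_lr=0.5, inner_steps=5):
    # inner loop: a few gradient steps on the target task's support set
    w = w.copy()
    for _ in range(inner_steps):
        w -= inner_lr * grad(w, X, y)
    return w

def maml_train(w, rng, meta_iters=50, tasks_per_batch=4, outer_lr=0.1):
    # outer loop (first-order MAML): move the shared initialization along
    # the query-set gradient evaluated at the adapted parameters
    for _ in range(meta_iters):
        meta_grad = np.zeros_like(w)
        for _ in range(tasks_per_batch):
            (Xs, ys), (Xq, yq) = sample_task(rng)
            meta_grad += grad(adapt(w, Xs, ys), Xq, yq)
        w -= outer_lr * meta_grad / tasks_per_batch
    return w

def proto_init(Xs, ys):
    # rough analogue of the Proto-MAML idea: initialize the linear
    # classifier from class prototypes (difference of class means)
    return Xs[ys == 1].mean(axis=0) - Xs[ys == 0].mean(axis=0)

W_BASE = rng.normal(size=d)
w_meta = maml_train(np.zeros(d), rng)

# few-shot evaluation on an unseen "language": 16 support samples
(Xs, ys), (Xq, yq) = sample_task(rng)
w_new = adapt(w_meta, Xs, ys)                 # MAML-style adaptation
w_proto = adapt(proto_init(Xs, ys), Xs, ys)   # prototype-initialized variant
print("query loss before/after adaptation:",
      round(loss(w_meta, Xq, yq), 3), round(loss(w_new, Xq, yq), 3))
```

The full second-order MAML would backpropagate through the inner-loop updates; the first-order variant shown here drops those second derivatives, which is a common simplification.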

Highlights

  • The proliferation of social media platforms (e.g., Twitter, Facebook, and Instagram) has changed the way people communicate with each other

  • The recent advancements in Natural Language Processing (NLP), Machine Learning (ML), and Deep Learning (DL) have enabled research communities to develop a variety of automatic hate speech detection methods [1,2,3,4,5,6], where, in general, hate speech is defined as any type of communication that is abusive, insulting, intimidating, harassing, and/or inciting violence or discrimination, disparaging a person or a vulnerable group based on some protected characteristics, e.g., gender, sexual orientation, religion, ethnicity, and race

  • Several studies have investigated the multilingual classification of hate speech and offensive language using multilingual, cross-lingual, or joint-learning approaches. We summarize the works on multilingual and cross-lingual approaches below


Introduction

The proliferation of social media platforms (e.g., Twitter, Facebook, and Instagram) has changed the way people communicate with each other. At the same time, concerns are growing that these platforms also enable abusive behaviors, e.g., threatening or harassing other users, cyberbullying, hate speech, and racial and sexual discrimination. Given the rapid growth of online hate speech and its severe negative effects, institutions, social media platforms, and researchers have been trying to react as quickly as possible. The recent advancements in Natural Language Processing (NLP), Machine Learning (ML), and Deep Learning (DL) have enabled research communities to develop a variety of automatic hate speech detection methods [1,2,3,4,5,6], where, in general, hate speech is defined as any type of communication that is abusive, insulting, intimidating, harassing, and/or inciting violence or discrimination, disparaging a person or a vulnerable group based on some protected characteristics, e.g., gender, sexual orientation, religion, ethnicity, and race. The introduction of transformer-based models, most notably BERT

