Abstract

The state-of-the-art abusive language detection models report great in-corpus performance, but underperform when evaluated on abusive comments that differ from the training scenario. As human annotation involves substantial time and effort, models that can adapt to newly collected comments can prove to be useful. In this paper, we investigate the effectiveness of several Unsupervised Domain Adaptation (UDA) approaches for the task of cross-corpora abusive language detection. In comparison, we adapt a variant of the BERT model, trained on large-scale abusive comments, using Masked Language Model (MLM) fine-tuning. Our evaluation shows that the UDA approaches result in sub-optimal performance, while the MLM fine-tuning does better in the cross-corpora setting. Detailed analysis reveals the limitations of the UDA approaches and emphasizes the need to build efficient adaptation methods for this task.

Highlights

  • Our evaluation shows that the Unsupervised Domain Adaptation (UDA) approaches result in sub-optimal performance, while the Masked Language Model (MLM) fine-tuning does better in the cross-corpora setting

  • A task related to abuse detection is sentiment classification (Bauwelinck and Lefever, 2019; Rajamanickam et al., 2020), and it involves an extensive body of work on domain adaptation

Summary

Introduction

[...]gies, targets of abuse, abusive language forms, etc. These call for approaches that can adapt to newly seen content outside the original training corpus. Our evaluation shows that the UDA approaches result in sub-optimal performance, while the MLM fine-tuning does better in the cross-corpora setting. Given an automatic text classification or tagging task, such as abusive language detection, a corpus with coherence can be considered a domain (Ramponi and Plank, 2020; Plank, 2011). Under this condition, domain adaptation approaches can be applied in cross-corpora evaluation setups. To the best of our knowledge, this is the first work that analyzes [...]

[...] religious minorities are the other targeted groups in Waseem and Hovy (2016), while African Ameri[...]
