Abstract

Hate speech is an important problem in the management of user-generated content. To remove offensive content or ban misbehaving users, content moderators need reliable hate speech detectors. Recently, deep neural networks based on the transformer architecture, such as the (multilingual) BERT model, have achieved superior performance in many natural language classification tasks, including hate speech detection. So far, these methods have not been able to quantify the reliability of their outputs. We propose a Bayesian method using Monte Carlo dropout within the attention layers of transformer models to provide well-calibrated reliability estimates. We evaluate and visualize the results of the proposed approach on hate speech detection problems in several languages. Additionally, we test whether affective dimensions can enhance the information extracted by the BERT model in hate speech classification. Our experiments show that Monte Carlo dropout provides a viable mechanism for reliability estimation in transformer networks. Used within the BERT model, it offers state-of-the-art classification performance and can detect less trusted predictions.
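As a concrete illustration of the approach described above, the following is a minimal sketch of Monte Carlo dropout inference with a pretrained BERT classifier: dropout layers are kept active at prediction time, several stochastic forward passes are run, and the spread of the predicted class probabilities serves as a simple reliability estimate. It assumes the HuggingFace `transformers` and `torch` packages; the model name, helper names, and number of samples are illustrative and not taken from the paper.

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification


def enable_mc_dropout(model: torch.nn.Module) -> None:
    """Put the model in eval mode, but keep dropout layers stochastic."""
    model.eval()
    for module in model.modules():
        if isinstance(module, torch.nn.Dropout):
            module.train()


def mcd_predict(model, tokenizer, text: str, n_samples: int = 30):
    """Run several stochastic forward passes and return the mean class
    probabilities plus their standard deviation as a reliability estimate."""
    enable_mc_dropout(model)
    inputs = tokenizer(text, return_tensors="pt", truncation=True)
    samples = []
    with torch.no_grad():
        for _ in range(n_samples):
            logits = model(**inputs).logits
            samples.append(torch.softmax(logits, dim=-1))
    probs = torch.stack(samples)            # (n_samples, 1, num_classes)
    return probs.mean(dim=0), probs.std(dim=0)


# Hypothetical usage with a generic fine-tuned classifier (names are examples):
# tokenizer = AutoTokenizer.from_pretrained("bert-base-multilingual-cased")
# model = AutoModelForSequenceClassification.from_pretrained(
#     "bert-base-multilingual-cased", num_labels=2)
# mean_p, std_p = mcd_predict(model, tokenizer, "example comment")
```

A large standard deviation across the stochastic passes flags a prediction as less trustworthy, which a moderator could then route for manual review.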

Highlights

  • With the growing popularity of social networks, hate speech has increased significantly [22]

  • We propose using Monte Carlo dropout (MCD) in the attention layers of transformer neural networks and keeping the dropout layers active during the prediction phase (see the sketch after this list)

  • This results in two new architectures: Bayesian attention networks (BANs) and MCD BERT
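The core idea of placing MCD inside the attention mechanism can be illustrated with a simplified single-head attention layer in which dropout is applied to the attention weights and deliberately left stochastic at prediction time. This is only a minimal PyTorch sketch under assumed layer sizes and names (`AttentionWithMCDropout`, `d_model`, `p_drop` are illustrative), not the exact BAN configuration.

```python
import math
import torch
import torch.nn as nn


class AttentionWithMCDropout(nn.Module):
    """Single-head scaled dot-product attention with dropout on the
    attention weights; the dropout is kept active at test time for MCD."""

    def __init__(self, d_model: int = 64, p_drop: float = 0.1):
        super().__init__()
        self.q_proj = nn.Linear(d_model, d_model)
        self.k_proj = nn.Linear(d_model, d_model)
        self.v_proj = nn.Linear(d_model, d_model)
        self.attn_dropout = nn.Dropout(p_drop)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        q, k, v = self.q_proj(x), self.k_proj(x), self.v_proj(x)
        scores = q @ k.transpose(-2, -1) / math.sqrt(q.size(-1))
        weights = self.attn_dropout(torch.softmax(scores, dim=-1))
        return weights @ v
```

Repeating the forward pass with this dropout left unfrozen yields a distribution of outputs rather than a single point estimate, which is what the BAN and MCD BERT architectures exploit.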



Introduction

With the growing popularity of social networks, hate speech has increased significantly [22]. Automated hate speech detection mechanisms are therefore urgently needed. In the last few years, recurrent neural networks (RNNs) were the most popular choice for text classification. Long short-term memory (LSTM) networks, the most successful RNN architecture, have already been successfully adapted for assessing predictive reliability in hate speech classification [7]. Neural network architectures with attention layers, called transformer architectures [6], have shown even better performance on almost all language processing tasks. Using transformer networks for masked language modeling produced breakthrough pretrained models such as BERT (Bidirectional Encoder Representations from Transformers) [43]. The attention mechanism, which is a crucial part of transformer networks, became an essential …
