Abstract
Hate speech is an important problem in the management of user-generated content. To remove offensive content or ban misbehaving users, content moderators need reliable hate speech detectors. Recently, deep neural networks based on the transformer architecture, such as the (multilingual) BERT model, have achieved superior performance in many natural language classification tasks, including hate speech detection. So far, however, these methods have not been able to quantify the reliability of their predictions. We propose a Bayesian method that uses Monte Carlo dropout within the attention layers of transformer models to provide well-calibrated reliability estimates. We evaluate and visualize the results of the proposed approach on hate speech detection problems in several languages. Additionally, we test whether affective dimensions can enhance the information extracted by the BERT model in hate speech classification. Our experiments show that Monte Carlo dropout provides a viable mechanism for reliability estimation in transformer networks. Used within the BERT model, it offers state-of-the-art classification performance and can detect less trusted predictions.
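The core idea can be illustrated with a short sketch. The snippet below shows Monte Carlo dropout inference with a pretrained BERT classifier, assuming the HuggingFace transformers library and the bert-base-multilingual-cased checkpoint with an illustrative (untrained) binary head; it is a minimal sketch, not the paper's exact fine-tuned model, and the helper names are made up. Dropout layers are kept active at prediction time, and the spread of probabilities across repeated forward passes serves as a reliability estimate.

```python
# Minimal sketch of MC dropout inference with a pretrained BERT classifier.
# Assumptions (not from the paper): HuggingFace 'transformers', the
# 'bert-base-multilingual-cased' checkpoint, and an untrained binary head.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-multilingual-cased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-multilingual-cased", num_labels=2
)

def enable_mc_dropout(model):
    """Keep dropout layers stochastic while the rest of the model is in eval mode."""
    model.eval()
    for module in model.modules():
        if isinstance(module, torch.nn.Dropout):
            module.train()

def mc_dropout_predict(text, n_samples=50):
    """Run several stochastic forward passes; the spread of the sampled
    probabilities serves as a reliability estimate for the prediction."""
    enable_mc_dropout(model)
    inputs = tokenizer(text, return_tensors="pt", truncation=True)
    probs = []
    with torch.no_grad():
        for _ in range(n_samples):
            logits = model(**inputs).logits
            probs.append(torch.softmax(logits, dim=-1)[0, 1])  # P(hate speech)
    probs = torch.stack(probs)
    return probs.mean().item(), probs.std().item()

mean_p, std_p = mc_dropout_predict("example comment to score")
print(f"P(hate) = {mean_p:.3f} +/- {std_p:.3f}")
```

Predictions whose sampled probabilities show a large standard deviation can then be flagged as less trusted and routed to human moderators.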
Highlights
With the rise of social network popularity, hate speech phenomena have significantly increased [22]
We propose using Monte Carlo dropout (MCD) in the attention layers of transformer neural networks and keeping the dropout layers active (unfrozen) during the prediction phase
This resulted in two new architectures, Bayesian attention networks (BANs) and MCD BERT (see the sketch below)
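As a rough illustration of the attention-level dropout behind BANs (a minimal sketch under assumed hyperparameters, not the paper's exact architecture), a single self-attention layer can keep dropout on its attention weights active at prediction time, so that repeated forward passes yield a distribution of outputs:

```python
# Minimal, illustrative self-attention layer whose attention-weight dropout
# stays active at prediction time (not the paper's exact BAN implementation).
import math
import torch
import torch.nn as nn
import torch.nn.functional as F

class MCDropoutSelfAttention(nn.Module):
    def __init__(self, dim, p_drop=0.1):
        super().__init__()
        self.query = nn.Linear(dim, dim)
        self.key = nn.Linear(dim, dim)
        self.value = nn.Linear(dim, dim)
        self.p_drop = p_drop
        self.dim = dim

    def forward(self, x, mc_dropout=True):
        q, k, v = self.query(x), self.key(x), self.value(x)
        scores = q @ k.transpose(-2, -1) / math.sqrt(self.dim)
        weights = F.softmax(scores, dim=-1)
        # Functional dropout with training=True keeps sampling stochastic
        # even when the surrounding model is in eval mode.
        weights = F.dropout(weights, p=self.p_drop, training=mc_dropout)
        return weights @ v

# Repeated calls give different outputs; their spread reflects uncertainty.
layer = MCDropoutSelfAttention(dim=16)
x = torch.randn(1, 8, 16)  # (batch, tokens, dim)
with torch.no_grad():
    samples = torch.stack([layer(x) for _ in range(10)])
print(samples.std(dim=0).mean())  # average per-element spread across samples
```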
Summary
With the rise of social network popularity, hate speech phenomena have significantly increased [22], and (automated) hate speech detection mechanisms are urgently needed. In the last few years, recurrent neural networks (RNNs) were the most popular choice for text classification. Long Short-Term Memory (LSTM) networks, the most successful RNN architecture, have already been successfully adapted for the assessment of predictive reliability in hate speech classification [7]. Neural network architectures with attention layers, called transformer architectures [6], have shown even better performance on almost all language processing tasks. Using transformer networks for masked language modeling produced breakthrough pretrained models, such as BERT (Bidirectional Encoder Representations from Transformers) [43]. The attention mechanism, which is a crucial part of transformer networks, became an essential
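For reference, the attention mechanism of [6] is the scaled dot-product attention over query, key, and value matrices $Q$, $K$, and $V$:

$$\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\!\left(\frac{Q K^{\top}}{\sqrt{d_k}}\right) V,$$

where $d_k$ is the key dimension. In the proposed approach, Monte Carlo dropout is applied within these attention layers and kept active during prediction.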