A Comparative Study of Deep Learning Methods for Hate Speech and Offensive Language Detection in Textual Data

Rohit Sinha, Rohan Kumar Gupta, Parth Bajaj, Yogesh Yadav

doi:10.1109/indicon52576.2021.9691704

Abstract

Hate speech is a prevalent problem on social networking sites and is faced by every major social media platform. Several methods have been explored for intent-based text classification, each with its own pros and cons depending on the type of intent, the size of the data set, the maximum length of the text, and so on. Several approaches to hate and offensive speech detection have also been presented in the literature. The main objective of this work is to present a comparative study of select deep learning methods for hate speech and offensive language detection. These methods include the recurrent neural network (RNN), convolutional neural network (CNN), long short-term memory (LSTM), and bidirectional encoder representations from transformers (BERT). We have also investigated the effect of the class weighting technique on the performance of these deep learning methods. Our study finds that the pre-trained BERT model outperforms the other explored models in both unweighted and weighted hate speech classification. For offensive language classification, the RNN and CNN models outperform all other models in the unweighted and weighted cases, respectively. We also find that the class weighting technique considerably boosts the hate speech classification performance of all four models.
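As an illustration of what class weighting can look like in practice, the following is a minimal sketch and not the authors' implementation. The abstract does not state which weighting scheme was used, so the sketch assumes the common inverse-frequency ("balanced") heuristic, in which class c receives the weight N / (K * n_c) for N samples, K classes, and n_c samples in class c. The function name and the toy labels are hypothetical.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Return inverse-frequency class weights: N / (K * n_c).

    Assumption: this "balanced" heuristic is used for illustration only;
    the abstract does not specify the weighting scheme.
    """
    counts = Counter(labels)      # n_c for each class c
    n_samples = len(labels)       # N
    n_classes = len(counts)       # K
    return {c: n_samples / (n_classes * n) for c, n in counts.items()}


# Hypothetical, deliberately imbalanced toy labels (not data from the study).
labels = ["hate"] * 1 + ["offensive"] * 6 + ["neither"] * 3
weights = balanced_class_weights(labels)

# Rare classes receive larger weights, so their errors count more in the loss.
for name, weight in sorted(weights.items()):
    print(f"{name}: {weight:.3f}")
```

Weights of this kind are typically passed to the training loss so that misclassifying a rare class, such as hate speech, is penalized more heavily than misclassifying a frequent one.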

Full Text