Abstract

As online social media content continues to grow, so does the spread of hate speech. Hate speech can have devastating consequences unless it is detected and monitored early. Recently, deep neural network-based hate speech detection models, particularly conventional single-channel Convolutional Neural Networks (CNNs), have achieved remarkable performance. However, the effectiveness of these models depends on the language they are trained on and the size of the training data. We argue that their effectiveness could be further enhanced by using multi-channel CNN models, even for under-resourced languages with limited training data. This is because a single-channel CNN might fail to exploit the potential of multiple channels to generate better features, which has not been well investigated for hate speech detection. Therefore, in this work, we explore the use of a multi-channel CNN (MC-CNN) to extract better features from different channels in an end-to-end manner on top of a word2vec embedding layer. Tested on a new small-scale Amharic hate speech dataset containing 2000 annotated social media comments, the experimental results show that the proposed MC-CNN model outperforms the single-channel CNN models but underperforms the baseline Support Vector Machine (SVM), with average F-scores of 81.3%, 78.2%, and 92.5%, respectively. The findings of the study imply that the proposed MC-CNN model can be used as an alternative deep learning solution for hate speech detection when dataset scarcity is an issue.

Keywords: Social media; Deep learning; Word embedding; Amharic hate speech detection; Single-channel; Multi-channel; Convolutional neural network
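
To make the single-channel versus multi-channel distinction concrete, the following is a minimal sketch of a forward pass only, not the model evaluated in this work. It assumes that "channels" are parallel convolution branches with different kernel widths whose max-pooled outputs are concatenated; the abstract does not specify the channel design, so this reading is an assumption. The function names, kernel widths, filter count, embedding size, and the random vectors standing in for word2vec embeddings are all hypothetical.

```python
import random

def conv1d_max_pool(embedded, filters, width):
    """Slide each filter over windows of `width` tokens, apply ReLU, keep the max."""
    pooled = []
    for weights, bias in filters:
        best = 0.0  # ReLU floor; also the value when the text is shorter than the window
        for start in range(len(embedded) - width + 1):
            # Flatten the window of token vectors into one vector
            window = [x for token in embedded[start:start + width] for x in token]
            activation = sum(w * x for w, x in zip(weights, window)) + bias
            best = max(best, activation)
        pooled.append(best)
    return pooled

def make_filters(rng, n_filters, width, dim):
    """Randomly initialised (weights, bias) pairs; a trained model would learn these."""
    return [([rng.uniform(-0.1, 0.1) for _ in range(width * dim)], 0.0)
            for _ in range(n_filters)]

def multi_channel_features(embedded, channels):
    """Concatenate the pooled features of every channel into one feature vector."""
    features = []
    for width, filters in channels:
        features.extend(conv1d_max_pool(embedded, filters, width))
    return features

rng = random.Random(0)
EMBED_DIM = 8              # hypothetical embedding size
N_FILTERS = 4              # hypothetical number of filters per channel
KERNEL_WIDTHS = (2, 3, 4)  # assumed: one channel per kernel width

channels = [(w, make_filters(rng, N_FILTERS, w, EMBED_DIM)) for w in KERNEL_WIDTHS]

# Stand-in for a 10-token comment already mapped to word2vec vectors
comment = [[rng.uniform(-1.0, 1.0) for _ in range(EMBED_DIM)] for _ in range(10)]

features = multi_channel_features(comment, channels)     # all three channels
single = multi_channel_features(comment, channels[:1])   # single-channel case
```

Under these assumptions the single-channel case is simply the first branch on its own, so the multi-channel feature vector contains the single-channel features plus those from the additional kernel widths. In a full model this vector would feed a classification layer trained end to end.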
