Abstract

The massive growth of social media platforms offers opportunities to communicate in diverse languages, through unstructured text, informal posts, misspelled content, and emojis. Social media users feel comfortable expressing their emotions, especially high-intensity emotions such as hate speech, in their mother tongue. Hate speech in any form targets groups and individuals and may trigger antisocial activities, hate crimes, and terrorist acts. Bengali social media users often post implicit or indirect hate text in Bengali. Existing Bengali hate speech detection research considers explicit hate speech, but in practice hate is more often expressed implicitly. To detect both implicit and explicit hate speech in low-resource content, social media platforms need highly efficient automated tools. Researchers have applied discriminative learning approaches (e.g., SVM, MLP, CNN) to distinguish hate text, but these give clear-cut results only when detecting direct hate speech. The proposed Bengali hate speech detection model takes two parallel approaches: (i) it applies an extended fuzzy SVM classifier for class-imbalanced datasets (FSVMCIL) with the multilingual BERT (mBERT) text embedding model to assign the first hate label; (ii) it applies a morphological analysis method with a hate similarity (HS) scheme to detect implicit and explicit hate content and assign the second hate label. By linking both labeling methods, this research extracts contextual Bengali hate speech from informal text. The novel HS method uses the Word2Vec word embedding model and a Bengali hate lexicon. It also converts emojis to text for more effective contextual analysis. This study conducts extensive experiments across various categories of the Bengali hate speech dataset and evaluates the proposed model using weighted F1 score, precision, recall, and accuracy. Results reveal a significant improvement in Bengali hate speech detection, with a 2.35% increase in F1 score and a 9.11% increase in accuracy.
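For readers unfamiliar with the weighted F1 score named above, the following is a minimal sketch of how the metric is commonly computed: the per-class F1 scores are averaged with weights proportional to each class's share of the true labels, which matters for class-imbalanced data. The function name `weighted_f1` and the toy labels are illustrative assumptions, not taken from the paper, and the paper's own evaluation code may differ.

```python
from collections import Counter


def weighted_f1(y_true, y_pred):
    """Support-weighted mean of per-class F1 scores (illustrative sketch)."""
    support = Counter(y_true)  # number of true instances per class
    total = len(y_true)
    score = 0.0
    for label, n in support.items():
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = n - tp
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        # Weight each class by its share of the true labels.
        score += (n / total) * f1
    return score


# Toy labels (hypothetical): three "hate" posts and one "none" post.
y_true = ["hate", "hate", "hate", "none"]
y_pred = ["hate", "hate", "none", "none"]
print(round(weighted_f1(y_true, y_pred), 4))
```

Because the weights follow class frequency, a classifier that performs poorly on a rare class is penalized less than under a macro average, so the two averages can diverge on imbalanced data.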
