Abstract

The popularity and widespread use of social media continually generate unmonitored data, spreading unwanted content such as hate speech and expressions that incite violence. Automatic detection of violence incitation is a challenging task and, to the best of our knowledge, the Urdu language has been completely neglected. Therefore, a robust framework is proposed for identifying expressions that incite violence in Urdu tweets. The potential of semantic features, word embeddings, and language models is explored to learn contextualized representations of violence incitation in Urdu tweets. In addition, the strength of the 1-Dimensional Convolutional Neural Network (1D-CNN) is exploited by tuning its parameters on a newly proposed annotated Urdu corpus. The annotated dataset consists of 4808 tweets manually collected from Pakistani Twitter accounts. The performance of the 1D-CNN with word uni-gram, Urdu Bidirectional Encoder Representations from Transformers (Urdu-BERT), and Urdu Robustly Optimized BERT Approach (Urdu-RoBERTa) models is compared to fine-tuned Urdu-RoBERTa, Bidirectional Long Short-Term Memory (BiLSTM), Convolutional BiLSTM (CBi-LSTM), and six state-of-the-art Machine Learning (ML) models. The results reveal that the 1D-CNN with the word uni-gram model sets a benchmark, achieving 89.84% accuracy and an 89.80% macro F1-score. Furthermore, it outperforms all comparable models, achieving an 89.76% F1-score for the violence class and an 89.84% F1-score for the not-violence class. The uniqueness of the proposed model is evaluated using the MARS shine-through and MARS occlusion metrics, on which the CNN model outperformed the others. The MARS metrics facilitate evaluation and visualization of classifier performance in terms of capturing unique true positive samples that are not predicted by other models. The findings of the proposed framework support further investigation in this domain.
