Topic Modeling based Text Classification Regarding Islamophobia using Word Embedding and Transformers Techniques

Ammar Saeed, Danish Khan, Talha Imran, Achyut Shankar, Muhammad Kamran, Hikmat Ullah Khan, Muhammad Attique Khan

doi:10.1145/3626318

Abstract

Islamophobia is a rising concern in the current era, in which Muslims face discrimination and negative attitudes towards their religion, Islam. Islamophobia is a type of racism practiced by individuals, groups, and organizations worldwide. Moreover, easy access to social media platforms and their growing usage have contributed to the spread of hate speech, false information, and negative opinions about Islam. In this research study, we focus on detecting Islamophobic textual content shared on various social media platforms. We explore the state-of-the-art techniques used in text data mining and Natural Language Processing (NLP). The topic modeling algorithm Latent Dirichlet Allocation (LDA) is used to find the top topics. Then, word embedding approaches such as Word2Vec and Global Vectors for Word Representation (GloVe) are used as feature extraction techniques. For text classification, we use the Transformer-based Deep Learning models Bidirectional Encoder Representations from Transformers (BERT) and Generative Pre-Trained Transformer (GPT). For comparison, we conduct an extensive empirical analysis of Machine Learning and Deep Learning algorithms using conventional textual features such as Term Frequency-Inverse Document Frequency (TF-IDF), N-grams, and Bag of Words (BoW). The empirical results, evaluated using standard performance measures, show that the proposed approach effectively detects textual content related to Islamophobia. Among the Machine Learning models evaluated on the study's corpus, Support Vector Machine (SVM) performed best with an F1 score of 91%. The Transformer-based NLP models and the Deep Learning model Convolutional Neural Network (CNN) combined with GloVe performed best among all the techniques except SVM with BoW. GPT, SVM combined with BoW, and BERT yielded the best F1 scores of 92%, 92%, and 91.9%, respectively, while CNN performed slightly worse with an F1 score of 91%.
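To make the conventional features and the reported metric concrete, the following is a minimal, standard-library-only Python sketch of Bag of Words, N-gram, and TF-IDF feature extraction together with a binary F1 score. It is not the authors' implementation: the whitespace tokenizer, the unsmoothed TF-IDF formula, the function names, and the toy strings are all assumptions made for illustration, and the abstract does not say which F1 averaging (binary, macro, or weighted) was used.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


def ngrams(tokens, n):
    """Return contiguous n-grams as space-joined strings."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def bag_of_words(docs, n=1):
    """Map each document to a Counter of its n-gram counts."""
    return [Counter(ngrams(tokenize(doc), n)) for doc in docs]


def tf_idf(docs):
    """Unsmoothed TF-IDF: tf = count / document length, idf = log(N / df)."""
    bows = bag_of_words(docs)
    n_docs = len(docs)
    # Iterating a Counter yields each distinct term once per document,
    # so this counts the number of documents containing each term.
    doc_freq = Counter(term for bow in bows for term in bow)
    weights = []
    for bow in bows:
        total = sum(bow.values())
        weights.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in bow.items()
        })
    return weights


def f1_score(y_true, y_pred, positive=1):
    """Binary F1: the harmonic mean of precision and recall."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Hypothetical toy documents, not taken from the study's corpus.
    toy_docs = ["the post shares news", "the post shares rumours"]
    print(bag_of_words(toy_docs, n=2)[0])
    print(tf_idf(toy_docs)[1])
    print(f1_score([1, 1, 0, 0], [1, 0, 1, 0]))
```

In this sketch, a term that occurs in every document receives a TF-IDF weight of zero, which is why smoothed variants are common in practice. The resulting sparse vectors would then be passed to a classifier such as an SVM, a step omitted here to keep the example dependency-free.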

Journal: ACM Transactions on Asian and Low-Resource Language Information Processing
Publication Date: Nov 1, 2023
Citations: 1

