Cyberbullying Text Identification based on Deep Learning and Transformer-based Language Models

Khalid Saifullah, Suhaima Jamal, Muhammad Ibrahim Khan, Iqbal H Sarker

doi:10.4108/eetinis.v11i1.4703

Open Access

https://doi.org/10.4108/eetinis.v11i1.4703

Abstract

In the contemporary digital age, social media platforms such as Facebook, Twitter, and YouTube serve as vital channels for individuals to express ideas and connect with others. Despite fostering increased connectivity, these platforms have inadvertently given rise to negative behaviors, particularly cyberbullying. While extensive research has been conducted on high-resource languages such as English, resources remain scarce for low-resource languages such as Bengali, Arabic, and Tamil, particularly in terms of language modeling. This study addresses this gap by developing BullyFilterNeT, a cyberbullying text identification system tailored for social media texts, with Bengali as a test case. BullyFilterNeT overcomes the Out-of-Vocabulary (OOV) challenges associated with non-contextual embeddings and addresses the limitations of context-aware feature representations. To facilitate a comprehensive understanding, three non-contextual embedding models (GloVe, FastText, and Word2Vec) are developed for feature extraction in Bengali. These embeddings are used in classification models comprising three statistical models (SVM, SGD, Libsvm) and four deep learning models (CNN, VDCNN, LSTM, GRU). Additionally, the study employs six transformer-based language models (mBERT, bELECTRA, IndicBERT, XLM-RoBERTa, DistilBERT, and BanglaBERT) to overcome the limitations of the earlier models. BanglaBERT-based BullyFilterNeT achieves the highest accuracy, 88.04% on our test set, underscoring its effectiveness for cyberbullying text identification in the Bengali language.
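
The OOV problem mentioned in the abstract can be illustrated with a small sketch. The abstract does not say how BullyFilterNeT handles OOV words, so the code below is not the authors' method; it only shows the general idea behind subword representations (FastText-style character n-grams with `<` and `>` boundary markers). The function names `char_ngrams`, `build_subword_vocab`, and `subword_coverage`, the n-gram range, and the toy English words are all assumptions made for illustration.

```python
def char_ngrams(word, n_min=3, n_max=5):
    """Return character n-grams of a word, FastText-style.

    The word is wrapped in '<' and '>' so that prefixes and suffixes
    produce n-grams distinct from word-internal ones.
    """
    marked = f"<{word}>"
    grams = []
    for n in range(n_min, n_max + 1):
        for i in range(len(marked) - n + 1):
            grams.append(marked[i:i + n])
    return grams


def build_subword_vocab(words, n_min=3, n_max=5):
    """Collect every character n-gram seen in the training words."""
    vocab = set()
    for word in words:
        vocab.update(char_ngrams(word, n_min, n_max))
    return vocab


def subword_coverage(word, subword_vocab, n_min=3, n_max=5):
    """Fraction of a word's n-grams that were seen during training."""
    grams = char_ngrams(word, n_min, n_max)
    if not grams:
        return 0.0
    known = sum(1 for g in grams if g in subword_vocab)
    return known / len(grams)


# Toy data (hypothetical): a word-level vocabulary misses "bullied",
# but the subword vocabulary still recognises part of it.
train_words = ["bully", "bullying"]
word_vocab = set(train_words)
subword_vocab = build_subword_vocab(train_words)

unseen = "bullied"
is_oov = unseen not in word_vocab
coverage = subword_coverage(unseen, subword_vocab)
print(f"{unseen!r} OOV at word level: {is_oov}, subword coverage: {coverage:.2f}")
```

A word-level embedding table has no vector for the unseen word, whereas a subword model can still compose a representation from the n-grams it shares with training words. Transformer models such as those listed in the abstract rely on a related idea through subword tokenization.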

Journal: EAI Endorsed Transactions on Industrial Networks and Intelligent Systems
Publication Date: Feb 22, 2024
Citations: 1
License type: CC BY 3.0

Similar Papers

Application of Transformer-Based Language Models to Detect Hate Speech in Social Media
Swapnanil Mukherjee ... Sujit Das
Journal of Computational and Cognitive Engineering | VOL. 2
17 Dec 2021

Poet Attribution of Urdu Ghazals using Deep Learning
Iqra Siddiqui ... Abdul Samad
22 Feb 2023

Text classification models for the automatic detection of nonmedical prescription medication use from social media
Mohammed Ali Al-Garadi ... Yucheng Ruan
BMC Medical Informatics and Decision Making | VOL. 21
26 Jan 2021

Classifying Drug Ratings Using User Reviews with Transformer-Based Language Models
Akhil Shiju ... Zhe He
01 Jun 2022
