AI-Based Hate Speech Detection in Albanian Social Media: New Dataset and Mobile Web Application Integration

Endrit Fetahi, Mentor Hamiti, Arsim Susuri, Jaumin Ajdari, Xhemal Zenuni

doi:10.3991/ijim.v18i24.50851

Abstract

This paper aims to advance AI-based hate speech (HS) detection in Albanian, a low-resource language in natural language processing (NLP). To address the shortage of data, we developed a human-annotated dataset of over 11,000 comments, curated from various Albanian social media platforms and containing a substantial number of HS instances. The dataset was annotated using a detailed two-layer taxonomy to capture the complex dimensions of HS. To ensure high-quality annotations, three expert annotators labeled the data under a majority voting system, reaching a Fleiss' kappa coefficient of 0.62, which indicates substantial agreement and supports the reliability and consistency of the annotations. We conducted a comparative analysis of several machine learning (ML) algorithms, including support vector machine (SVM), Naïve Bayes (NB), XGBoost, and random forest (RF), paired with various text vectorization techniques and pre-processing methods. In binary classification, the NB model with term frequency-inverse document frequency (TF-IDF) vectorization achieved the highest performance, with an F1 score of 0.80. For multiclass classification, XGBoost outperformed the other models, achieving an F1 score of 0.77. Notably, our experiments showed that pre-processing steps generally reduced model performance, suggesting that raw text inputs work better for the Albanian language. Through error analysis using local interpretable model-agnostic explanations (LIME), we identified key challenges, such as polysemy and irony, that contributed to misclassifications. To demonstrate the practical applicability of our work, we developed a user-friendly mobile web application based on the best-performing model, providing real-time HS detection with the potential for integration into social media platforms.
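
As an illustration of the annotation step described in the abstract, the following is a minimal sketch of majority voting and Fleiss' kappa for a fixed number of annotators per comment. It is not the authors' code: the function names, the label strings "hs" and "not", and the toy ratings are assumptions made for this example, and the abstract does not say how ties or the two-layer taxonomy were handled.

```python
from collections import Counter


def majority_vote(labels):
    """Return the most frequent label, or None when the top two labels tie."""
    ranked = Counter(labels).most_common()
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return None  # no majority; how the paper resolved ties is not stated
    return ranked[0][0]


def fleiss_kappa(ratings):
    """Compute Fleiss' kappa.

    ratings: list of items, each a list with one label per annotator.
    Every item must have the same number of annotators (at least two).
    """
    n_items = len(ratings)
    n_raters = len(ratings[0])
    if n_raters < 2 or any(len(item) != n_raters for item in ratings):
        raise ValueError("each item needs the same number (>= 2) of ratings")

    categories = sorted({label for item in ratings for label in item})
    # counts[i][j] = number of annotators who gave item i the category j
    counts = [[Counter(item)[cat] for cat in categories] for item in ratings]

    # Observed agreement: mean over items of the share of agreeing rater pairs
    per_item = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    p_observed = sum(per_item) / n_items

    # Chance agreement: sum of squared overall category proportions
    total = n_items * n_raters
    proportions = [
        sum(row[j] for row in counts) / total for j in range(len(categories))
    ]
    p_expected = sum(p * p for p in proportions)

    if p_expected == 1.0:
        raise ValueError("kappa is undefined when only one category is used")
    return (p_observed - p_expected) / (1.0 - p_expected)


# Toy data (hypothetical): four comments, three annotators each
toy_ratings = [
    ["hs", "hs", "hs"],
    ["hs", "hs", "not"],
    ["not", "not", "not"],
    ["not", "not", "hs"],
]
final_labels = [majority_vote(item) for item in toy_ratings]
kappa = fleiss_kappa(toy_ratings)
```

With three annotators and binary labels a majority always exists; a tie can only occur when there are three or more categories, as in a multiclass layer.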

Journal: International Journal of Interactive Mobile Technologies (iJIM)
Publication Date: Dec 17, 2024
License type: CC BY 4.0