Abstract
Machine translation (MT) is widely used to translate content on social media platforms with the aim of improving accessibility. Much of the content circulated on social media is user-generated and often contains non-standard spelling, hashtags, and emojis that pose challenges to MT systems. This leads to many mistranslated instances being presented to users of these platforms, hindering their understanding of content written in other languages. In this paper, we investigate the impact of MT on offensive language identification. We posit that MT and potential mistranslations have an important and largely under-explored impact on social media tasks such as sentiment analysis and offensive language identification. We create MT-Offense, a novel dataset containing English originals and their translations into Arabic, Hindi, Marathi, Sinhala, and Spanish produced by multiple open-access Neural Machine Translation systems. We evaluate the performance of various offensive language identification models on both original and MT content across different training and test set combinations, reporting F1 scores. Our results show that (1) offensive language identification models perform better on original data than on MT data, and (2) using MT data in training helps models identify offensive language in MT content better than models trained exclusively on original data.