Ceasing hate with MoH: Hate Speech Detection in Hindi-English Code-Switched Language

Arushi Sharma, Minni Jain, Anubha Kabra

doi:10.48550/arxiv.2110.09393

Publication Date: Oct 18, 2021

Abstract

Social media has become a bedrock for people to voice their opinions worldwide. Because anonymity gives users a greater sense of freedom, it is possible to disregard social etiquette online and attack others without facing severe consequences, which inevitably propagates hate speech. Current measures to sift online content and offset the spread of hatred do not go far enough. One contributing factor is the prevalence of regional languages on social media and the paucity of language-flexible hate speech detectors. The proposed work focuses on analyzing hate speech in Hindi-English code-switched language. Our method explores transformation techniques to capture a precise text representation. To preserve the structure of the data and yet use it with existing algorithms, we developed MoH, or Map Only Hindi; "MoH" means "love" in Hindi. The MoH pipeline consists of language identification followed by Roman-to-Devanagari Hindi transliteration using a knowledge base of Roman Hindi words. Finally, it employs fine-tuned Multilingual BERT and MuRIL language models. We conducted several quantitative experiments on three datasets and evaluated performance using precision, recall, and F1. The first experiment studies the performance of MoH-mapped text with classical machine learning models and shows an average increase of 13% in F1 scores. The second compares the proposed work's scores with those of the baseline models and shows a rise in performance of 6%. Finally, the third compares the proposed MoH technique with various data simulations using the existing transliteration library. Here, MoH outperforms the rest by 15%. Our results demonstrate a significant improvement in the state-of-the-art scores on all three datasets.
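
The mapping step described in the abstract (identify the language of each token, then transliterate only the Roman-script Hindi tokens to Devanagari) can be illustrated with a minimal sketch. This is not the authors' implementation: the tiny `ROMAN_HINDI_KB` dictionary, the lookup-based `identify_language` rule, and the function name `map_only_hindi` are all hypothetical stand-ins chosen for illustration, and a real system would need a far larger knowledge base and a proper language identifier.

```python
# Hypothetical knowledge base: Roman-script Hindi word -> Devanagari form.
# The entries are illustrative only and do not come from the paper.
ROMAN_HINDI_KB = {
    "tum": "तुम",
    "bahut": "बहुत",
    "bura": "बुरा",
    "ho": "हो",
    "nahi": "नहीं",
}


def identify_language(token: str) -> str:
    """Toy token-level language ID: 'hi' if the token is in the knowledge base, else 'en'.

    Assumption: a dictionary lookup stands in for a real language identifier.
    """
    return "hi" if token.lower() in ROMAN_HINDI_KB else "en"


def map_only_hindi(text: str) -> str:
    """Transliterate only the tokens tagged as Hindi; leave English tokens untouched."""
    mapped = []
    for token in text.split():
        if identify_language(token) == "hi":
            mapped.append(ROMAN_HINDI_KB[token.lower()])
        else:
            mapped.append(token)
    return " ".join(mapped)


if __name__ == "__main__":
    print(map_only_hindi("tum bahut bura person ho"))
    # prints: तुम बहुत बुरा person हो
```

The mixed-script output keeps English tokens in Roman script and Hindi tokens in Devanagari, which is the kind of text a multilingual model could then be fine-tuned on. Whitespace tokenization is a simplification here; punctuation attached to a word would prevent the lookup from matching.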
