Abstract

The increased use of the Internet over the past decade has led many people to adopt social media platforms. These platforms offer many benefits, but they also carry significant risks and drawbacks, such as hate speech. In multilingual societies such as India, people frequently mix their native language with English, so detecting hateful content in such bilingual code-mixed data has drawn growing interest from the research community. Most previous work focuses on high-resource languages such as English, and comparatively few researchers have studied mixed bilingual data such as Hinglish. In this study, we investigate the performance of transformer models such as IndicBERT and multilingual BERT (mBERT), as well as transfer learning from pre-trained language models such as ULMFiT and Bidirectional Encoder Representations from Transformers (BERT), for detecting hateful content in Hinglish. We also propose a Transformer-based Interpreter and Feature extraction model on a Deep Neural Network (TIF-DNN). Experimental results show that the proposed model outperforms existing state-of-the-art methods for hate speech identification in the Hinglish language, with an accuracy of 73%.
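To illustrate the kind of pipeline the abstract refers to, the sketch below fine-tunes a publicly available mBERT checkpoint for binary hate-speech classification on code-mixed text with the Hugging Face Transformers API. This is not the authors' released code or the proposed TIF-DNN model; the checkpoint name, label scheme, example sentences, and hyperparameters are illustrative assumptions.

```python
# Minimal sketch: fine-tuning mBERT for binary hate-speech classification
# on Hinglish text. All data and settings below are assumed for illustration.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

MODEL_NAME = "bert-base-multilingual-cased"  # public mBERT checkpoint

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSequenceClassification.from_pretrained(
    MODEL_NAME, num_labels=2  # assumed labels: 0 = non-hate, 1 = hate
)

# Hypothetical code-mixed examples, for illustration only.
texts = ["yeh movie bahut achhi thi", "tum log bilkul bekaar ho"]
labels = torch.tensor([0, 1])

batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")

# One fine-tuning step on the toy batch.
model.train()
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)
outputs = model(**batch, labels=labels)
outputs.loss.backward()
optimizer.step()

# Predict labels for the same batch.
model.eval()
with torch.no_grad():
    preds = model(**batch).logits.argmax(dim=-1)
print(preds.tolist())
```

In practice, the same loop would run over a labeled Hinglish corpus for several epochs, and IndicBERT could be swapped in simply by changing the checkpoint name.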
