People increasingly share and express their emotions on online social media platforms such as Twitter, Facebook, and YouTube. Homophobia and transphobia refer to abusive, hateful, threatening, or discriminatory acts that target gay, lesbian, transgender, or bisexual individuals and cause them discomfort. Detecting such acts on social media is called homophobia and transphobia detection, a task that has recently gained interest among researchers. Identifying homophobic and transphobic content in under-resourced languages is challenging, and so far no such resources exist for categorizing this type of content in Malayalam and Hindi. This paper presents a new high-quality dataset for detecting homophobia and transphobia in Malayalam and Hindi. Our dataset consists of 5,193 comments in Malayalam and 3,203 comments in Hindi. We also report experiments performed with traditional machine learning and transformer-based deep learning models on the Malayalam, Hindi, English, Tamil, and Tamil-English datasets.