Hope Speech Detection for Dravidian Languages Using Cross-Lingual Embeddings with Stacked Encoder Architecture.

Arunima Sundar,Akshay Ramakrishnan,Avantika Balaji,Thenmozhi Durairaj

doi:10.1007/s42979-021-00943-8

Arunima Sundar, Akshay Ramakrishnan + Show 2 more

Open Access

https://doi.org/10.1007/s42979-021-00943-8

Copy DOI

Abstract

The task of hope speech detection has gained traction in the natural language processing field owing to the need for an increase in positive reinforcement online during the COVID-19 pandemic. Hope speech detection focuses on identifying texts among social media comments that could invoke positive emotions in people. Students and working adults alike posit that they experience a lot of work-induced stress further proving that there exists a need for external inspiration which in this current scenario, is mostly found online. In this paper, we propose a multilingual model, with main emphasis on Dravidian languages, to automatically detect hope speech. We have employed a stacked encoder architecture which makes use of language agnostic cross-lingual word embeddings as the dataset consists of code-mixed YouTube comments. Additionally, we have carried out an empirical analysis and tested our architecture against various traditional, transformer, and transfer learning methods. Furthermore a k-fold paired t test was conducted which corroborates that our model outperforms the other approaches. Our methodology achieved an F1-score of 0.61 and 0.85 for Tamil and Malayalam, respectively. Our methodology is quite competitive to the state-of-the-art methods. The code for our work can be found in our GitHub repository (https://github.com/arunimasundar/Hope-Speech-LT-EDI).

Full Text