Bilingual Cyber-aggression detection on social media using LSTM autoencoder

Kirti Kumari, Nripendra Pratap Rana, Jyoti Prakash Singh, Yogesh Kumar Dwivedi

doi:10.1007/s00500-021-05817-y

Abstract

Cyber-aggression is offensive behaviour that attacks people based on race, ethnicity, religion, gender, sexual orientation and other traits. It has become a major problem on online social media. In this research, we developed a deep learning-based model to identify different levels of aggression (direct, indirect and no aggression) in a social media post in a bilingual setting. The model is an autoencoder built from an LSTM network and trained on non-aggressive comments only. Any aggressive comment (direct or indirect) is therefore treated as an anomaly by the system and is marked as an Overtly (direct) or Covertly (indirect) aggressive comment depending on the autoencoder's reconstruction loss. On bilingual (English and Hindi) datasets from two popular social media sites, Facebook and Twitter, the model outperformed the current state-of-the-art models, with improvements of more than 11% on the test sets of the English dataset and more than 6% on the test sets of the Hindi dataset.
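
The decision step described in the abstract, labelling a comment from the autoencoder's reconstruction loss, can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names, the use of mean squared error as the loss, the two threshold parameters, and the assumption that covert aggression yields a lower loss than overt aggression are all hypothetical choices made for illustration; the abstract does not specify them. The LSTM autoencoder itself is omitted, and the sketch only takes an input vector and its reconstruction as given.

```python
from typing import Sequence


def reconstruction_loss(original: Sequence[float],
                        reconstructed: Sequence[float]) -> float:
    """Mean squared error between an input vector and its reconstruction.

    Assumption: MSE stands in for whatever loss the autoencoder uses.
    """
    if len(original) != len(reconstructed):
        raise ValueError("vectors must have the same length")
    if len(original) == 0:
        raise ValueError("vectors must be non-empty")
    squared_errors = [(o - r) ** 2 for o, r in zip(original, reconstructed)]
    return sum(squared_errors) / len(squared_errors)


def label_from_loss(loss: float,
                    covert_threshold: float,
                    overt_threshold: float) -> str:
    """Map a reconstruction loss to one of three aggression levels.

    Assumption: the autoencoder saw only non-aggressive comments, so a
    low loss means "looks like training data". The ordering of the two
    thresholds (covert below overt) is a hypothetical choice.
    """
    if covert_threshold > overt_threshold:
        raise ValueError("covert_threshold must not exceed overt_threshold")
    if loss < covert_threshold:
        return "non-aggressive"
    if loss < overt_threshold:
        return "covertly aggressive"
    return "overtly aggressive"
```

In such a setup the two thresholds would typically be tuned on held-out data, for example from the distribution of losses on non-aggressive validation comments.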

Full Text