Abstract

Online abusive language detection (ALD) has become a societal issue of increasing importance in recent years. Most previous work in online ALD has focused on solving a single abusive language problem in a single domain, such as Twitter, and has not transferred well to the general ALD task or to other domains. In this paper, we introduce a new generic ALD framework, MACAS, which is capable of addressing several types of ALD tasks across different domains. Our framework builds multi-aspect abusive language embeddings that represent the target and content aspects of abusive language, and applies a textual graph embedding that analyses the user's linguistic behaviour. We then propose a cross-attention gate flow mechanism to fuse the multiple aspects of abusive language. Quantitative and qualitative evaluation results show that our ALD algorithm rivals or exceeds six state-of-the-art ALD algorithms across seven ALD datasets covering multiple aspects of abusive language and different online community domains.
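The abstract describes fusing multiple aspect embeddings via a cross-attention gate flow. As a rough illustration only (the paper's actual architecture, layer sizes, and gating formulation are not specified here; the function names and the sigmoid-gate mixing rule below are our own assumptions), one aspect sequence can attend over another and the attended summary can be gated against the original embedding:

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention_gate(target_emb, content_emb):
    """Illustrative sketch: attend from the target-aspect token sequence
    over the content-aspect sequence, then mix the attended summary with
    the original target embedding through an element-wise sigmoid gate.
    This is NOT the paper's exact mechanism, just the general pattern."""
    d = target_emb.shape[-1]
    scores = target_emb @ content_emb.T / np.sqrt(d)    # (Lt, Lc) scaled dot-product
    attended = softmax(scores, axis=-1) @ content_emb   # (Lt, d) content summary per target token
    gate = 1.0 / (1.0 + np.exp(-(target_emb + attended)))  # crude element-wise gate
    return gate * attended + (1.0 - gate) * target_emb

# toy example: 4 target-aspect tokens, 6 content-aspect tokens, dimension 8
t = rng.standard_normal((4, 8))
c = rng.standard_normal((6, 8))
fused = cross_attention_gate(t, c)
print(fused.shape)  # (4, 8)
```

The gate lets each dimension of the fused representation lean on whichever aspect is more informative for that token, which is the intuition behind gated fusion of multi-aspect embeddings.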

Highlights

  • Abusive language in online communities has become a significant societal problem (Nobata et al., 2016); online abusive language detection (ALD) aims to identify any insult, vulgarity, or profanity that debases a target or group online

  • Our guiding question: “What would be the best generic ALD model that can be used for different types of abusive language detection sub-tasks and in different online communities?” Waseem et al. (2017) reviewed the existing online ALD literature and defined a generic abusive language typology that encompasses the targets of a wide range of abusive language sub-tasks across different types of domain

  • Our evaluation shows that most state-of-the-art ALD algorithms do not generalise to different types of abusive language problems or datasets

Summary

Introduction

Abusive language in online communities has become a significant societal problem (Nobata et al., 2016), and online abusive language detection (ALD) aims to identify any type of insult, vulgarity, or profanity that debases a target or group online. Our guiding question is: “What would be the best generic ALD model that can be used for different types of abusive language detection sub-tasks and in different online communities?” To address this, we build on Waseem et al. (2017), who reviewed the existing online abusive language detection literature and defined a generic abusive language typology that encompasses the targets of a wide range of abusive language sub-tasks in different types of domain. The typology is organised along two aspects:

1) Target aspect: the abuse can be directed towards either a) a specific individual/entity or b) a generalised group. This is an essential sociological distinction, as the latter refers to a whole category of people, such as a race or gender, rather than a specific individual or organisation.

2) Content aspect: the abusive content can be explicit or implicit. Explicit abuse is unambiguous in its potential to be damaging, while implicit abusive language does not immediately imply abuse.
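The two-aspect typology above is essentially a small label space: every abusive instance gets a target coordinate and a content coordinate. A minimal sketch of that label structure (the class and field names here are illustrative, not taken from the paper) could look like:

```python
from dataclasses import dataclass
from enum import Enum

class Target(Enum):
    DIRECTED = "directed"        # aimed at a specific individual or entity
    GENERALISED = "generalised"  # aimed at a whole group, e.g. a race or gender

class Content(Enum):
    EXPLICIT = "explicit"  # unambiguous in its potential to be damaging
    IMPLICIT = "implicit"  # abuse that does not immediately imply itself

@dataclass(frozen=True)
class AbuseLabel:
    """Illustrative two-aspect label following the Waseem et al. (2017) typology."""
    target: Target
    content: Content

# e.g. implicit abuse directed at a generalised group (sexist stereotyping)
label = AbuseLabel(Target.GENERALISED, Content.IMPLICIT)
print(label.target.value, label.content.value)  # generalised implicit
```

Framing sub-tasks such as hate speech or personal attacks as points in this 2x2 space is what lets a single generic model cover datasets that were originally annotated with different, task-specific schemes.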

