Abstract

As offensive content has become pervasive in social media, there has been much research on identifying potentially offensive messages. However, previous work on this topic did not consider the problem as a whole, but rather focused on detecting very specific types of offensive content, e.g., hate speech, cyberbullying, or cyber-aggression. In contrast, here we target several different kinds of offensive content. In particular, we model the task hierarchically, identifying the type and the target of offensive messages in social media. For this purpose, we compiled the Offensive Language Identification Dataset (OLID), a new dataset with tweets annotated for offensive content using a fine-grained three-layer annotation scheme, which we make publicly available. We discuss the main similarities and differences between OLID and pre-existing datasets for hate speech identification, aggression detection, and similar tasks. We further experiment with and compare the performance of different machine learning models on OLID.
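The hierarchical scheme can be read as a cascade of three classification decisions: Level A flags a post as offensive (OFF) or not (NOT); Level B decides whether an offensive post is a targeted insult or threat (TIN) or untargeted (UNT); Level C identifies the target of a targeted post as an individual (IND), a group (GRP), or other (OTH). The sketch below shows one way such a cascade could be wired up; the toy training examples and the TF-IDF + linear SVM pipeline are illustrative assumptions on our part, not the models or data evaluated in the paper.

```python
# Minimal sketch of a three-level cascade over the OLID label scheme.
# Assumption: the toy data and TF-IDF + LinearSVC pipeline are illustrative
# choices only, not the paper's actual models or training data.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

def train(texts, labels):
    """Fit one level of the cascade: bag-of-ngrams features + linear SVM."""
    model = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LinearSVC())
    model.fit(texts, labels)
    return model

# Hypothetical toy examples; a real system would train on OLID itself.
level_a = train(
    ["have a great day", "thanks for sharing",
     "you are an idiot", "those people are trash"],
    ["NOT", "NOT", "OFF", "OFF"],   # Level A: offensive or not
)
level_b = train(
    ["you are an idiot", "those people are trash",
     "damn this traffic", "what the hell"],
    ["TIN", "TIN", "UNT", "UNT"],   # Level B: targeted vs. untargeted
)
level_c = train(
    ["you are an idiot", "those people are trash",
     "that event was a disaster"],
    ["IND", "GRP", "OTH"],          # Level C: individual, group, or other
)

def classify(post):
    """Apply later levels only when earlier predictions warrant them."""
    a = level_a.predict([post])[0]
    if a == "NOT":
        return (a, None, None)
    b = level_b.predict([post])[0]
    if b == "UNT":
        return (a, b, None)
    return (a, b, level_c.predict([post])[0])

print(classify("you are an idiot"))  # e.g. ('OFF', 'TIN', 'IND') on this toy model
```

Cascading the decisions this way means Levels B and C only ever see posts that the previous level passed along, mirroring the conditional structure of the annotation scheme itself.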

Highlights

  • Offensive content has become pervasive in social media and a serious concern for government organizations, online communities, and social media platforms

  • Waseem et al. (2017) analyzed the similarities between approaches proposed in previous work and argued for a typology that differentiates whether the language is directed towards a specific individual or entity or towards a generalized group, and whether the abusive content is explicit or implicit

  • To the best of our knowledge, no prior work has explored the target of offensive language, which can be important in many scenarios, e.g., when studying hate speech with respect to a specific target


Introduction

Offensive content has become pervasive in social media and a serious concern for government organizations, online communities, and social media platforms. Prior work has studied offensive language on Twitter (Xu et al., 2012; Burnap and Williams, 2015; Davidson et al., 2017; Wiegand et al., 2018), in Wikipedia comments, and in Facebook posts (Kumar et al., 2018). Previous studies have looked into different aspects of offensive language, such as the use of abusive language (Nobata et al., 2016; Mubarak et al., 2017), (cyber-)aggression (Kumar et al., 2018), (cyber-)bullying (Xu et al., 2012; Dadvar et al., 2013), toxic comments, hate speech (Kwok and Wang, 2013; Djuric et al., 2015; Burnap and Williams, 2015; Davidson et al., 2017; Malmasi and Zampieri, 2017, 2018), and offensive language (Wiegand et al., 2018).
