Multi-label Hate Speech and Abusive Language Detection in Indonesian Twitter

Muhammad Okky Ibrohim, Indra Budi

doi:10.18653/v1/w19-3506

Abstract

Hate speech and abusive language spreading on social media need to be detected automatically to avoid conflict between citizens. Moreover, hate speech has a target, category, and level that also need to be detected to help the authorities prioritize which hate speech must be addressed immediately. This research discusses multi-label text classification for abusive language and hate speech detection, including detection of the target, category, and level of hate speech, in Indonesian Twitter. We use a machine learning approach with Support Vector Machine (SVM), Naive Bayes (NB), and Random Forest Decision Tree (RFDT) classifiers, and Binary Relevance (BR), Label Power-set (LP), and Classifier Chains (CC) as the data transformation methods. We used several kinds of feature extraction: term frequency, orthography, and lexicon features. Our experimental results show that, in general, the RFDT classifier using LP as the transformation method gives the best accuracy with fast computational time.
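
The data transformation methods named above turn a multi-label problem into one or more single-label problems. The following is a minimal sketch of two of them, Label Power-set and Binary Relevance, in plain Python; Classifier Chains is not shown. It is not the authors' implementation. The function names and the three toy label columns are hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Sequence, Tuple

LabelVector = Tuple[int, ...]


def fit_label_powerset(label_rows: Sequence[Sequence[int]]) -> Dict[LabelVector, int]:
    """Assign one class id to each distinct label combination, in order of first appearance."""
    combo_to_class: Dict[LabelVector, int] = {}
    for row in label_rows:
        combo = tuple(row)
        if combo not in combo_to_class:
            combo_to_class[combo] = len(combo_to_class)
    return combo_to_class


def to_classes(label_rows: Sequence[Sequence[int]],
               combo_to_class: Dict[LabelVector, int]) -> List[int]:
    """Label Power-set: replace each label vector by its single class id."""
    return [combo_to_class[tuple(row)] for row in label_rows]


def to_label_vectors(class_ids: Sequence[int],
                     combo_to_class: Dict[LabelVector, int]) -> List[LabelVector]:
    """Invert the Label Power-set mapping to recover label vectors from class ids."""
    class_to_combo = {cid: combo for combo, cid in combo_to_class.items()}
    return [class_to_combo[cid] for cid in class_ids]


def binary_relevance_targets(label_rows: Sequence[Sequence[int]]) -> List[List[int]]:
    """Binary Relevance: one independent binary target column per label."""
    return [list(column) for column in zip(*label_rows)]


# Toy label matrix. The columns are hypothetical: [hate_speech, abusive, strong_level].
Y = [(1, 1, 0), (0, 1, 0), (1, 1, 0), (0, 0, 0), (1, 0, 1)]

mapping = fit_label_powerset(Y)
lp_targets = to_classes(Y, mapping)        # one multi-class target for a single classifier
br_targets = binary_relevance_targets(Y)   # one binary target per label, one classifier each
```

With Label Power-set, a single multi-class classifier is trained on `lp_targets` and its predictions are mapped back with `to_label_vectors`. With Binary Relevance, one binary classifier is trained per column of `br_targets`. A known limitation of this Label Power-set sketch is that it can only predict label combinations that occurred in the training data.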

Highlights

Hate speech is a direct or indirect speech toward a person or group containing hatred based on something inherent to that person or group (Komnas HAM, 2015).
In Indonesia, abusive words are usually derived from: an unpleasant condition, such as mental disorder, sexual deviation, physical disability, lack of modernization, a lack of etiquette, conditions not allowed by religion, and other conditions related to unfortunate circumstances; animals that have bad characteristics, are disgusting, or are forbidden in certain religions; astral beings that often interfere with human life; dirty and bad-smelling objects; parts of the body and activities related to sexual activity; and low-class professions that are forbidden by religion (Wijana and Rohmadi, 2010; Ibrohim and Budi, 2018).
In the experiments that combine the best features, the best performance is obtained with the Random Forest Decision Tree (RFDT) classifier and the Label Power-set (LP) data transformation method, using the combination of character quadgrams, question mark, and negative sentiment features. This combination gives only 65.73% accuracy, which still does not exceed the 66.12% accuracy given by the RFDT classifier with the LP data transformation method using the word unigram feature.
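
The feature types mentioned above (word unigrams, character quadgrams, question marks, and negative sentiment) can be illustrated with a small sketch. This is an assumption-laden example, not the authors' feature extraction code: the tokenization rule, the feature names, and the `negative_lexicon` argument are hypothetical, and a real negative sentiment lexicon would have to be supplied separately.

```python
import re
from collections import Counter
from typing import Dict, Iterable


def word_unigrams(text: str) -> Counter:
    """Term frequency of lowercased word tokens (simple regex tokenization, an assumption)."""
    return Counter(re.findall(r"\w+", text.lower()))


def char_ngrams(text: str, n: int = 4) -> Counter:
    """Term frequency of overlapping character n-grams; n=4 gives character quadgrams."""
    lowered = text.lower()
    return Counter(lowered[i:i + n] for i in range(len(lowered) - n + 1))


def extract_features(text: str, negative_lexicon: Iterable[str]) -> Dict[str, int]:
    """Combine character quadgrams, a question mark count, and a negative word count."""
    lexicon = set(negative_lexicon)  # hypothetical lexicon supplied by the caller
    features: Dict[str, int] = {}
    for gram, count in char_ngrams(text, 4).items():
        features["char4=" + gram] = count
    # Orthography feature: number of question marks in the text.
    features["question_marks"] = text.count("?")
    # Lexicon feature: number of tokens found in the negative sentiment lexicon.
    unigrams = word_unigrams(text)
    features["negative_words"] = sum(c for w, c in unigrams.items() if w in lexicon)
    return features


example = extract_features("bad?", negative_lexicon={"bad"})
```

Each tweet becomes a sparse dictionary of counts, which could then be vectorized and passed to a classifier trained on the transformed labels.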

Summary

Introduction

Hate speech is a direct or indirect speech toward a person or group containing hatred based on something inherent to that person or group (Komnas HAM, 2015). Based on our literature study, there has been no research on abusive language and hate speech detection that simultaneously detects the hate speech target, category, and level. The contributions of this work include building an Indonesian Twitter dataset for abusive language and hate speech detection, including detection of the target, category, and level of hate speech, and conducting preliminary experiments on multi-label abusive language and hate speech detection (including hate speech target, category, and level detection) in Indonesian Twitter using machine learning approaches.

Among the hate speech levels, strong hate speech is hate speech in the form of swearing, slander, blasphemy, stereotyping, or labeling aimed at an individual or group that includes incitement or provocation to bring about open conflict. This kind of hate speech is classified as strong because it needs to be prioritized for prompt resolution: it can invite widespread conflict and can lead to conflict or physical destruction in the real world.

Data Collection and Annotation

Experiments and Discussions

First Scenario Experiment Result

Second Scenario Experiment Result

Discussions

Findings

Conclusions and Future Works