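Decision tree algorithm for detecting multilabel hate speech and abusive language on Indonesian-language Twitter (Algoritme decision tree untuk mendeteksi ujaran kebencian dan bahasa kasar multilabel pada Twitter berbahasa Indonesia)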

Fauzi Ihsan, Surya Agustian, Iwan Iskandar, Nazruddin Safaat Harahap

doi:10.14710/jtsiskom.2021.13907

Open Access

https://doi.org/10.14710/jtsiskom.2021.13907

Abstract

Hate speech and abusive language are easily found in written communication on social media such as Twitter. They often cause disputes between the parties involved: the victims and the authors of the tweets. However, it is difficult for those who take sides to judge whether a tweet contains hate speech and/or abusive language. This research aims to develop a method that classifies tweets as abusive and/or as containing hate speech. If hate speech is detected, the system also measures its severity level. The dataset comprises 13,126 real tweets. Word embeddings are used to represent the text input as features. For tweet classification, we use a Decision Tree algorithm. Feature engineering and parameter tuning improved the classification of the three classes: hate speech, abusive language, and hate speech level. Among the engineered features, the lexicon feature yields the highest accuracy across the three classes, outperforming the special features and the textual features. The average accuracy over the three classes increased from 69.77% to 70.48% for a 90:10 training-testing split, and from 69.35% to 69.54% for an 80:20 split.
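The feature construction described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that a tweet is represented by the average of its word embeddings and that the lexicon feature is a simple count of lexicon words in the tweet. The toy embedding table, the lexicon, and the names `tweet_features` and `average_accuracy` are hypothetical and chosen only for illustration. A decision tree classifier would then be trained on vectors like these, one target per class.

```python
from statistics import mean

# Hypothetical 2-dimensional toy embeddings (assumption); the abstract does
# not say which embedding model or dimensionality the authors used.
EMBEDDINGS = {
    "kamu": [0.1, 0.3],
    "bodoh": [0.9, -0.2],
    "hebat": [-0.4, 0.8],
}
DIM = 2

# Hypothetical abusive-word lexicon (assumption, for illustration only).
ABUSIVE_LEXICON = {"bodoh", "tolol"}


def tweet_features(tweet):
    """Return [averaged embedding..., lexicon hit count] for one tweet."""
    tokens = tweet.lower().split()
    vectors = [EMBEDDINGS[t] for t in tokens if t in EMBEDDINGS]
    if vectors:
        # Average each embedding dimension over the known tokens.
        avg = [mean(v[i] for v in vectors) for i in range(DIM)]
    else:
        # No known token: fall back to a zero vector.
        avg = [0.0] * DIM
    # Lexicon feature: number of tokens found in the lexicon.
    lexicon_hits = sum(1 for t in tokens if t in ABUSIVE_LEXICON)
    return avg + [float(lexicon_hits)]


def average_accuracy(per_class_accuracy):
    """Average the accuracy over the classes, as the abstract reports it."""
    return mean(per_class_accuracy.values())
```

For example, `tweet_features("Kamu bodoh")` returns a three-element vector whose last entry is the lexicon count for that tweet, and `average_accuracy` takes a dictionary with one accuracy value per class (hate speech, abusive language, hate speech level).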

Journal: Jurnal Teknologi dan Sistem Komputer
Publication Date: Oct 31, 2021
Citations: 2
License type: CC BY-SA 4.0
