Classification of Hate Comments on Twitter Using a Combination of Logistic Regression and Support Vector Machine Algorithm

Nabila Putri Damayanti, Putri Susi Sundari, Della Egyta Prameswari, Wiyanda Puspita

doi:10.52465/joiser.v2i1.229

Open Access

https://doi.org/10.52465/joiser.v2i1.229

Abstract

This study aims to improve the accuracy of classifying Twitter comments as hate speech or non-hate speech. The task matters because the development of technology also brings negative effects, one of which is hate speech. Classification is performed with a combination of Logistic Regression (LR) and Support Vector Machine (SVM) methods. The combination is motivated by LR's ease of implementation and speed and by SVM's ability to handle more complex, non-linear data. In this context, LR models the probability that a comment on Twitter contains hateful content. The model outputs a probability for each class, and a threshold is applied to determine the final class. The results show that combining these methods yields a classification model with an accuracy of 96%.
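The abstract does not say how the LR and SVM outputs are combined, so the following is only a minimal sketch of one possible decision rule, not the authors' method. It assumes the LR probability and a sigmoid-squashed SVM margin are averaged with a weight and then compared against a threshold. The function names, the weight `lr_weight`, and the default threshold of 0.5 are all hypothetical.

```python
import math


def sigmoid(z: float) -> float:
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def lr_probability(weights, bias, features) -> float:
    """Logistic regression: probability that a comment is hate speech."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return sigmoid(z)


def combined_label(lr_prob: float, svm_margin: float,
                   threshold: float = 0.5, lr_weight: float = 0.5):
    """Combine an LR probability with an SVM decision margin.

    Assumption: the SVM margin is squashed into (0, 1) with a sigmoid
    (uncalibrated) and averaged with the LR probability. The abstract
    does not specify the actual combination rule.
    """
    svm_score = sigmoid(svm_margin)
    score = lr_weight * lr_prob + (1.0 - lr_weight) * svm_score
    label = "hate" if score >= threshold else "non-hate"
    return label, score


if __name__ == "__main__":
    # Hypothetical scores for a single comment.
    print(combined_label(lr_prob=0.9, svm_margin=2.0))
    # A stricter threshold flags fewer comments as hate speech.
    print(combined_label(lr_prob=0.6, svm_margin=0.0, threshold=0.7))
```

Raising the threshold trades recall for precision: fewer comments are labeled as hate speech, but those that are carry a higher combined score.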
