The Effect of Pre-Processing on the Classification of Twitter’s Flood Disaster Messages Using Support Vector Machine Algorithm

Mera Kartika Delimayanti, Risna Sari, Mauldy Laya, Pahrul Pahrul, Rizqi Fitri Naryanto, M Reza Faisal

doi:10.1109/icae50557.2020.9350387

Abstract

This study evaluates classification performance on eyewitness messages from Twitter, which is treated as a social network sensor. Classifying flood disaster messages requires the data to be preprocessed first, and the preprocessing affects the resulting accuracy. Because stopword removal is one of the preprocessing steps, its effect on classification performance is examined. A Support Vector Machine (SVM) performs the classification, with words weighted by Term Frequency-Inverse Document Frequency (TF-IDF). The dataset consists of 3,000 messages collected from Twitter, 1,000 for each label. Three different experiments were conducted to observe the effect of stopwords on accuracy, and the highest accuracy levels reached were 76.6%, 76.87%, and 77.87%. Based on these experiments, stopword handling strongly influences the accuracy of flood disaster message classification on Twitter.
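To make the interaction between stopword removal and TF-IDF weighting concrete, the following is a minimal Python sketch, not the authors' implementation. The stopword list, the example messages, the whitespace tokenizer, and the particular TF-IDF variant (relative term frequency times log(N / df)) are all assumptions chosen for illustration; the abstract does not specify them. The resulting weight vectors are the kind of input an SVM classifier would then be trained on.

```python
import math
from collections import Counter

# Hypothetical stopword list for illustration only; the study's list is not given.
STOPWORDS = {"the", "is", "in", "a", "of", "my"}


def tokenize(text, remove_stopwords=False):
    """Lowercase, split on whitespace, and optionally drop stopwords."""
    tokens = text.lower().split()
    if remove_stopwords:
        tokens = [t for t in tokens if t not in STOPWORDS]
    return tokens


def tfidf(docs, remove_stopwords=False):
    """Return one {term: weight} dict per document.

    Assumed variant: tf = count / document length, idf = log(N / df).
    """
    tokenized = [tokenize(d, remove_stopwords) for d in docs]
    n_docs = len(tokenized)

    # Document frequency: number of documents containing each term.
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))

    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        vectors.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return vectors


# Made-up example messages, not data from the study.
tweets = [
    "the flood is in my street",
    "the traffic is heavy in the city",
    "flood warning issued for the river",
]

with_stop = tfidf(tweets, remove_stopwords=False)
without_stop = tfidf(tweets, remove_stopwords=True)
```

In this toy corpus, removing stopwords shortens each message, so the relative frequency, and therefore the TF-IDF weight, of the remaining content words such as "flood" increases. This is one plausible mechanism by which stopword handling can shift classifier accuracy.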
