Combining SentiStrength and Multilayer Perceptron in Twitter Sentiment Classification

Eko Yudhi Prastowo, Endroyono Endroyono, Eko Mulyanto Yuniarno

doi:10.1109/isitia.2019.8937134

Abstract

Advances in internet technology have made social media part of people's everyday lifestyle. Companies and governments use social media as a source of instant feedback, gauging user sentiment from comments and reviews. A sentiment is an opinion or view based on strong feelings toward something. Whether a comment expresses positive or negative sentiment can be determined manually, by having humans analyze comments one by one, or automatically, by using machine learning for classification. Machine learning requires training data and test data with positive and negative labels, and this labeling is generally done manually by humans. In this study, we used machine learning to classify the sentiment of data collected from Twitter. The machine learning method used was Multilayer Perceptron, with Naive Bayes as a comparison. The dataset was labeled manually. As additional training data, labeled data was generated from unlabeled data using an English lexicon dictionary called SentiStrength. Features were extracted using vectorization and TF-IDF. The study aims to measure how adding SentiStrength-generated training data during the learning process affects the accuracy of the machine learning model. The classification models were tested on 627 tweets. The additional training data increased average accuracy by 5% over the initial accuracy. Multilayer Perceptron was more accurate than Naive Bayes, with highest accuracies of 77.71% and 76.07%, respectively.
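The augmentation step described in the abstract, labeling unlabeled tweets with a lexicon and adding them to the manually labeled training set, can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: `TOY_LEXICON`, `lexicon_label`, and `augment` are hypothetical names, the six-word lexicon is a stand-in for the SentiStrength dictionary, and summing word polarities is a simplification that may differ from how SentiStrength scores text.

```python
import re

# Toy polarity lexicon for illustration only; NOT the SentiStrength dictionary.
TOY_LEXICON = {"good": 2, "great": 3, "love": 3, "bad": -2, "awful": -3, "hate": -3}


def lexicon_label(text, lexicon=TOY_LEXICON):
    """Return 'positive' or 'negative' from summed word polarities.

    Returns None when the total is zero (no lexicon word, or scores cancel).
    """
    tokens = re.findall(r"[a-z']+", text.lower())
    score = sum(lexicon.get(tok, 0) for tok in tokens)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return None


def augment(labeled, unlabeled, lexicon=TOY_LEXICON):
    """Append lexicon-labeled tweets to the manually labeled set.

    Tweets the lexicon cannot decide on are skipped rather than guessed.
    """
    extra = []
    for text in unlabeled:
        label = lexicon_label(text, lexicon)
        if label is not None:
            extra.append((text, label))
    return labeled + extra


# Hypothetical example data, not taken from the study's dataset.
manual = [("I love this phone", "positive"), ("Worst service ever", "negative")]
pool = ["great camera, good battery", "I hate the awful update", "just arrived today"]
training = augment(manual, pool)
```

The resulting `training` list would then go through feature extraction (for example TF-IDF) and be passed to a classifier. Skipping undecided tweets is a design choice of this sketch; the abstract does not say how such cases were handled.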

Full Text