Abstract
Information in the form of unstructured text is increasing and has become commonplace on the internet. Businesses and companies can easily find and use this information through social media, one of which is Twitter. Twitter ranks sixth among the most widely accessed social media platforms today. A drawback of Twitter data is that it is unstructured and large in volume. Consequently, it is difficult for businesses with limited resources to know public opinion about their services. To make it easier for businesses to learn the public's sentiment and improve their service in the future, public sentiment on Twitter needs to be classified as positive, neutral, or negative. The Multiclass Support Vector Machine (SVM) method is a supervised learning classification method that can handle three-class classification. This paper uses the One Against All (OAA) approach to determine the class. It reports the results of classifying tweet data with the OAA Multiclass SVM method using five different feature weighting models (unigram, bigram, trigram, unigram+bigram, and word cloud), with the aim of finding the best accuracy and the most important features when processing large data. The highest accuracy, 80.59, is obtained by the unigram TF-IDF model combined with OAA Multiclass SVM with gamma 0.7.
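The two building blocks named in the abstract, n-gram TF-IDF weighting and the One Against All decision rule, can be illustrated with a minimal sketch. It is not the authors' implementation. The function names (`ngrams`, `extract_terms`, `tfidf`, `one_against_all_predict`), the whitespace tokenizer, and the TF-IDF variant (relative term frequency times log(N / df)) are all assumptions made for illustration. The per-class decision values would normally come from one binary SVM per class; no SVM is trained here, and the scores in the usage lines are made-up numbers.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return the n-grams of a token list, each joined by a single space."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def extract_terms(text, ns=(1,)):
    """Lowercase, split on whitespace (assumed tokenizer), and collect n-grams.

    ns=(1,) gives unigrams, ns=(2,) bigrams, ns=(1, 2) unigram+bigram.
    """
    tokens = text.lower().split()
    terms = []
    for n in ns:
        terms.extend(ngrams(tokens, n))
    return terms


def tfidf(docs, ns=(1,)):
    """Return one {term: weight} dict per document.

    Assumed variant: tf = count / number of terms in the document,
    idf = log(N / df), where df is the number of documents containing the term.
    """
    term_lists = [extract_terms(d, ns) for d in docs]
    n_docs = len(docs)
    df = Counter()
    for terms in term_lists:
        df.update(set(terms))
    vectors = []
    for terms in term_lists:
        counts = Counter(terms)
        total = len(terms) or 1  # avoid division by zero for empty documents
        vectors.append(
            {t: (c / total) * math.log(n_docs / df[t]) for t, c in counts.items()}
        )
    return vectors


def one_against_all_predict(scores):
    """Pick the class whose binary classifier gives the largest decision value.

    scores maps each class label to the decision value of the classifier
    trained to separate that class from all the others.
    """
    return max(scores, key=scores.get)


# Usage with hypothetical tweets and hypothetical decision values.
tweets = ["good service", "bad service", "service is okay"]
unigram_vectors = tfidf(tweets, ns=(1,))
unigram_bigram_vectors = tfidf(tweets, ns=(1, 2))
label = one_against_all_predict({"positive": -0.2, "neutral": 0.4, "negative": -1.1})
```

In this sketch a term that occurs in every document, such as "service", receives weight zero, which is the usual effect of the inverse document frequency factor. Other TF-IDF variants add smoothing or normalization and would give different numbers.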
Highlights
Information in the form of unstructured text-based documents is increasing and becoming commonplace on the internet
This study aims to overcome the weaknesses of Term Frequency-Inverse Document Frequency (TF-IDF) in dealing with single terms
The results show that there is a 10% increase in accuracy compared to TF-IDF without collocation integration
Summary
Information in the form of unstructured text-based documents is increasing and becoming commonplace on the internet. This happens because the number of internet users increases every year[1]. Businesses and companies often find and use this information through social media, one of which is Twitter. TF-IDF features can be adapted to data in the form of complaints, questions, or suggestions about a given service and combined with machine learning methods to select the best-performing approach, so that the service can be improved in the future.