Abstract

Sentiment classification often involves a large number of features, which lowers classification accuracy, particularly for the negative class. The purpose of this study was to determine a suitable number of features for each class in the training data by integrating the information gain (IG) technique, used to reduce the number of features, with the synthetic minority over-sampling technique (SMOTE), used to correct class imbalance. The approach improved accuracy in every class and was evaluated with four classifiers to compare their accuracy: J48, Naive Bayes, k-Nearest Neighbor (k = 1, 2, 3), and Support Vector Machine (SVM). The TP rate of each class, positive and negative, was used as the evaluation metric, so accuracy is reported as the TP rate of the positive class and the TP rate of the negative class. The results showed that combining IG with SMOTE identified a suitable number of features for sentiment analysis. SVM gave higher accuracy than the other methods, with a TP rate of 86.50% for the positive class and 89.10% for the negative class, at a suitable SMOTE level of 300%.
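The three building blocks named above (IG for feature scoring, SMOTE for over-sampling, and the per-class TP rate) can be illustrated with a minimal, standard-library Python sketch. This is not the authors' implementation. The function names, the toy data, the neighbour count `k`, and the reading of "300%" as three synthetic samples per minority sample (the usual SMOTE convention) are all assumptions made for illustration.

```python
import math
import random
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def information_gain(feature_values, labels):
    """IG of a discrete feature: H(labels) minus the weighted entropy per feature value."""
    total = len(labels)
    remainder = 0.0
    for value in set(feature_values):
        subset = [y for x, y in zip(feature_values, labels) if x == value]
        remainder += len(subset) / total * entropy(subset)
    return entropy(labels) - remainder


def smote(minority, percent, k=2, seed=0):
    """Generate synthetic minority samples by interpolating toward near neighbours.

    Assumption: percent=300 means 3 synthetic samples per original sample.
    """
    rng = random.Random(seed)
    per_sample = percent // 100
    synthetic = []
    for i, x in enumerate(minority):
        others = [m for j, m in enumerate(minority) if j != i]
        neighbours = sorted(others, key=lambda m: math.dist(x, m))[:k]
        for _ in range(per_sample):
            nb = rng.choice(neighbours)
            gap = rng.random()  # random position on the segment from x to nb
            synthetic.append([a + gap * (b - a) for a, b in zip(x, nb)])
    return synthetic


def tp_rate(y_true, y_pred, cls):
    """Fraction of instances of class `cls` that were predicted as `cls`."""
    predictions = [p for t, p in zip(y_true, y_pred) if t == cls]
    return sum(p == cls for p in predictions) / len(predictions)


# Toy data (hypothetical): one binary feature and four labelled documents.
feature = [1, 1, 0, 0]
labels = ["pos", "pos", "neg", "neg"]
ig_score = information_gain(feature, labels)

# Three hypothetical minority-class vectors, over-sampled at 300%.
minority = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
new_samples = smote(minority, percent=300)

# Per-class TP rate for a hypothetical set of predictions.
predicted = ["pos", "neg", "neg", "neg"]
tp_pos = tp_rate(labels, predicted, "pos")
tp_neg = tp_rate(labels, predicted, "neg")
```

In a workflow like the one described, features would be ranked by their IG score and only the top-ranked ones kept before over-sampling the minority class; how many to keep and which SMOTE level to use are the quantities the study tunes.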
