Visitor sentiment toward the Borobudur Temple tourist destination in Indonesia can be classified with a variety of algorithms to obtain optimal results. Algorithm performance can be assessed from confusion-matrix metrics (accuracy, precision, recall), the Area Under the Curve (AUC), and the Receiver Operating Characteristic (ROC) curve. This study applied the Naïve Bayes Classifier (NBC), Decision Tree (DT), and Support Vector Machine (SVM) algorithms to 3850 text records obtained from the Tripadvisor website, specifically reviews written by visitors to Borobudur Temple. The method follows the Cross-Industry Standard Process for Data Mining (CRISP-DM), aimed at optimizing tourist destination products and services, through its six stages: business understanding, data understanding, data preparation, modeling, evaluation, and deployment. For NBC, accuracy changed from 98.73% to 95.6%, precision from 98.72% to 98.97%, and recall from 100% to 96.54%, while the AUC changed from 0.500 (50%) to 0.693 (69.35%). For DT, accuracy changed from 97.55% to 94.40%, precision from 97.63% to 91.86%, and recall from 99.90% to 99.47%, while the AUC changed from 0.591 (59.1%) to 0.932 (93.2%). For SVM, accuracy changed from 98.73% to 99.41%, precision from 98.72% to 100%, and recall from 100% to 99.01%, while the AUC changed from 0.961 (96.1%) to 1.00 (100%).
In addition, the T-test results show that SVM outperforms the other algorithms: its T-test value is 0.994, compared with 0.944 for DT and 0.98 for NBC. The ROC values indicate that DT also performs well alongside SVM. These results indicate that the Support Vector Machine is the recommended algorithm for analyzing the sentiment of visitors to Borobudur Temple.
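
As an illustration of the evaluation measures named above, the following minimal sketch computes accuracy, precision, and recall from confusion-matrix counts, and AUC as the fraction of positive/negative pairs that a classifier ranks correctly. It is not the study's own tooling: the function names `confusion_metrics` and `auc_from_scores` and all input counts and scores are hypothetical, chosen only to show the arithmetic.

```python
from typing import Dict, Sequence


def confusion_metrics(tp: int, fp: int, fn: int, tn: int) -> Dict[str, float]:
    """Accuracy, precision, and recall from binary confusion-matrix counts."""
    total = tp + fp + fn + tn
    return {
        # Share of all reviews classified correctly.
        "accuracy": (tp + tn) / total if total else 0.0,
        # Share of predicted positives that are truly positive.
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        # Share of true positives that were found.
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


def auc_from_scores(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a positive outranks a negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Hypothetical counts, not taken from the study.
    print(confusion_metrics(tp=90, fp=5, fn=10, tn=95))
    # Hypothetical labels (1 = positive review) and classifier scores.
    print(auc_from_scores([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1]))
```

Under this definition, a classifier that assigns every review the same score obtains an AUC of 0.5, which is why high accuracy can coexist with an AUC near 0.5 when one class dominates the data.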