Abstract

Sentiment analysis is needed to identify tourist preferences for the products and services of a tourist destination. This study therefore uses the Cross-Industry Standard Process for Data Mining (CRISP-DM) to classify visitor reviews of Komodo Island and Rinca Island, obtained from the Tripadvisor website, into negative and positive sentiments, and then recommends the development of an information system that can optimize the products and services of tourist destinations on both islands. The algorithms used are k-Nearest Neighbor (k-NN), Naïve Bayes Classifier (NBC), Support Vector Machine (SVM), and Decision Tree (DT). Classifying the data with Term Frequency-Inverse Document Frequency (TF-IDF) word weighting through the sentiment extract operator shows that the five words appearing most often in tourist reviews of Komodo Island are Komodo dragons (1894), dragons (1596), island (1492), tour (840), and boat (774). The five words appearing most often in tourist reviews of Rinca Island are komodo dragon (1042), dragons (962), island (882), rinca (606), and boat (372). In addition, classification of the 564 Komodo Island reviews and 364 Rinca Island reviews using the CRISP-DM-based k-NN, NBC, SVM, and DT algorithms shows that SVM is the best-performing algorithm, with an accuracy of 99.69%, precision of 100%, recall of 99.39%, f-measure of 99.69%, Area Under Curve (AUC) of 100%, and a t-test value of 0.958. For the review data on Rinca Island products and services, SVM also performs best, with 100% accuracy, 100% precision, 100% recall, 100% f-measure, 100% AUC, and a t-test value of 0.964.
Comparing the performance of SVM before and after applying the Synthetic Minority Oversampling Technique (SMOTE) shows that the algorithm performs better when the SMOTE operator is used. SVM is therefore a suitable model for analyzing tourist sentiment on Komodo Island and Rinca Island based on CRISP-DM.
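
As a reading aid, the reported f-measure values can be checked against the reported precision and recall. The sketch below assumes the f-measure is the usual harmonic mean of precision and recall (the F1 score); the abstract does not state the formula, and the function and variable names are illustrative only.

```python
def f_measure(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (assumed F1 definition)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# SVM figures as reported in the abstract, written as fractions.
# First reported result: precision 100%, recall 99.39%.
first_result_f = f_measure(precision=1.0, recall=0.9939)
# Rinca Island result: precision 100%, recall 100%.
rinca_f = f_measure(precision=1.0, recall=1.0)

print(f"First reported f-measure: {first_result_f:.2%}")  # 99.69%
print(f"Rinca Island f-measure:   {rinca_f:.2%}")         # 100.00%
```

Under that assumption, both computed values match the f-measure figures quoted in the abstract, so the reported precision, recall, and f-measure are mutually consistent.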
