Abstract

This study analyzes public sentiment on over-tourism. Following the Cross-Industry Standard Process for Data Mining (CRISP-DM), it proceeds through business understanding, data processing, modeling, evaluation, and deployment, and uses the Naive Bayes Classifier (NBC) to classify public sentiment about the challenges associated with over-tourism. NBC combined with the Synthetic Minority Over-sampling Technique (SMOTE) performs best, with an accuracy of 84.82%, precision of 91.69%, recall of 76.75%, F-measure of 83.47%, and AUC of 0.838. NBC without SMOTE reaches an accuracy of 78.16%, precision of 87.61%, recall of 74.56%, F-measure of 80.51%, and AUC of 0.745. The gap between the two indicates that addressing class imbalance improves predictive performance. Combining CRISP-DM with NBC and SMOTE therefore offers a workable approach to sentiment analysis of public perceptions and attitudes toward over-tourism.
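To illustrate the oversampling step named above, here is a minimal, simplified sketch of the SMOTE idea in plain Python. It is not the study's implementation. It assumes a binary task with texts already converted to numeric feature vectors, and the names `smote`, `balance`, `k`, and `seed` are hypothetical. Each synthetic point is placed at a random position on the segment between a minority sample and one of its nearest minority neighbours.

```python
import random


def smote(minority, n_new, k=3, seed=0):
    """Return n_new synthetic samples interpolated between minority points.

    Simplified variant: the base point is drawn at random for every
    synthetic sample (an assumption of this sketch).
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot have more neighbours than points

    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Indices of the k nearest *other* minority points.
        neighbours = sorted(
            (j for j in range(len(minority)) if j != i),
            key=lambda j: sq_dist(base, minority[j]),
        )[:k]
        other = minority[rng.choice(neighbours)]
        gap = rng.random()  # position along the segment base -> other
        synthetic.append([b + gap * (o - b) for b, o in zip(base, other)])
    return synthetic


def balance(X, y, minority_label, k=3, seed=0):
    """Oversample the minority class until both classes have equal size."""
    minority = [x for x, label in zip(X, y) if label == minority_label]
    n_majority = len(y) - len(minority)
    new = smote(minority, n_majority - len(minority), k=k, seed=seed)
    return X + new, y + [minority_label] * len(new)


# Toy data: 3 "neg" vectors (minority) and 5 "pos" vectors (majority).
X = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.3],
     [0.9, 0.8], [1.0, 0.9], [0.8, 1.0], [0.9, 1.0], [1.0, 1.0]]
y = ["neg"] * 3 + ["pos"] * 5
X_bal, y_bal = balance(X, y, minority_label="neg")
```

In practice, oversampling of this kind is normally applied to the training split only, so that evaluation metrics are computed on real, unsynthesized examples.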
