Abstract

Purpose: This research presents machine learning models for predicting international tourist arrivals in Indonesia during the COVID-19 pandemic using multisource Internet data.

Design/methodology/approach: To develop the prediction models, this research utilizes multisource Internet data from the TripAdvisor travel forum and Google Trends. Temporal factors, posts and comments, the search queries index and previous tourist arrival records are set as predictors. Four sets of predictors and three distinct data compositions were utilized for training the machine learning models, namely artificial neural networks (ANNs), support vector regression (SVR) and random forest (RF). To evaluate the models, this research uses three accuracy metrics, namely root mean square error (RMSE), mean absolute error (MAE) and mean absolute percentage error (MAPE).

Findings: Prediction models trained using multisource Internet data predictors have better accuracy than those trained using single-source Internet data or other predictors. In addition, using more training sets that cover the phenomenon of interest, such as COVID-19, will enhance the prediction model's learning process and accuracy. The experiments show that the RF models have better prediction accuracy than the ANN and SVR models.

Originality/value: First, this study pioneers the practice of a multisource Internet data approach in predicting tourist arrivals amid the unprecedented COVID-19 pandemic. Second, the use of multisource Internet data to improve prediction performance is validated with real empirical data. Finally, this is one of the few papers to provide perspectives on the current dynamics of Indonesia's tourism demand.
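The three accuracy metrics named in the abstract (RMSE, MAE and MAPE) can be illustrated with a minimal sketch using their standard textbook definitions. The function names and the arrival figures below are hypothetical and chosen only for illustration; they are not taken from the paper, and the authors' own implementation may differ.

```python
import math


def rmse(actual, predicted):
    """Root mean square error: square root of the mean squared residual."""
    return math.sqrt(
        sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)
    )


def mae(actual, predicted):
    """Mean absolute error: mean of the absolute residuals."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mape(actual, predicted):
    """Mean absolute percentage error, in percent.

    Undefined when any actual value is zero, so callers must
    handle such periods separately.
    """
    return 100.0 * sum(
        abs((a - p) / a) for a, p in zip(actual, predicted)
    ) / len(actual)


# Hypothetical monthly arrivals (illustrative values, not real data).
actual = [100.0, 200.0, 400.0]
predicted = [110.0, 180.0, 400.0]

print(f"RMSE: {rmse(actual, predicted):.3f}")
print(f"MAE:  {mae(actual, predicted):.3f}")
print(f"MAPE: {mape(actual, predicted):.3f}%")
```

RMSE penalizes large errors more heavily than MAE, while MAPE expresses the error relative to the actual value, so it grows quickly when actual arrivals are very small.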

Highlights

  • The increasing use of web-based platforms stimulates the growing availability of structured and unstructured data (Li et al., 2021)

  • The results show the positive impact of combining multisource Internet data (Journal of Tourism Futures, p. 13)

  • Prediction models utilizing a combination of predictors from an online travel forum and a search engine have better accuracy than those using the predictor from a single source of Internet data, either the online travel forum only or search queries only


Summary

Introduction

The increasing use of web-based platforms stimulates the growing availability of structured and unstructured data (Li et al., 2021). Search engines (Bangwayo-Skeete and Skeete, 2015), online forums (Fronzetti Colladon et al., 2019) and photo-sharing apps (Miah et al., 2017) are just a handful of the applications that contribute to this growth. The availability of online data has attracted academics and practitioners seeking to extract business value from it. The data emitted by these online platforms provide valuable information on customer behavior (Bangwayo-Skeete and Skeete, 2015; Li et al., 2017). Forecasting is one of the most popular use cases that can be improved by utilizing this big Internet data (Song et al., 2019).


