Abstract

This study presents extreme gradient boosting (XGBoost) and random forest (RF) models that predict tourism demand using three groups of predictors: international COVID-19 cases, international tourist arrivals, and the destination's quarantine policy. Unlike other ‘black box’ machine learning models, these two tree-based models offer better interpretability through explicit feature importance scores and tree structure representations. The paper evaluates the accuracy of these models in predicting international tourist arrivals in Indonesia during the COVID-19 pandemic using long-range (January 2008–June 2021) and short-range (January 2018–June 2021) training datasets. Their performance is compared with that of benchmark models, such as the artificial neural network, autoregressive integrated moving average (ARIMA), and seasonal ARIMA models. In general, the tree-based machine learning models outperformed all benchmark models. The international COVID-19 case and tourist arrival predictors dominate the feature importance scores in the XGBoost models, whereas Google Trends keywords on quarantine policies show significant importance in the RF models but not in the XGBoost models. Moreover, the RF models are better than the XGBoost models in terms of accuracy and in overcoming overfitting.
