Abstract

This study applies extreme gradient boosting (XGBoost) and random forest (RF) models to predict tourism demand, using international COVID-19 cases, international tourist arrivals, and the destination's quarantine policy as predictors. Unlike other ‘black box’ machine learning models, these two tree-based models offer better interpretability through explicit feature importance scores and tree structure representations. This paper evaluates the accuracy of these models in predicting international tourist arrivals in Indonesia during the COVID-19 pandemic using long-range (January 2008–June 2021) and short-range (January 2018–June 2021) training datasets. The performance of the two models is compared with benchmark models, such as the artificial neural network, autoregressive integrated moving average (ARIMA), and seasonal ARIMA models. In general, the tree-based machine learning models outperformed all benchmark models. The international COVID-19 cases and tourist arrivals predictors have the dominant feature importance scores in the XGBoost models. Meanwhile, Google Trends keywords on quarantine policies show significant importance in the RF models but not in the XGBoost models. Moreover, the RF models outperform the XGBoost models in both accuracy and resistance to overfitting.
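To give a sense of how different the two training windows are in size, the sketch below counts the observations in each. It assumes the series is sampled monthly, which the month-level date ranges suggest but the abstract does not state outright; the helper name `month_range` is hypothetical and not taken from the study.

```python
from datetime import date


def month_range(start, end):
    """Return the first day of every month from start to end, inclusive."""
    months = []
    year, month = start.year, start.month
    while (year, month) <= (end.year, end.month):
        months.append(date(year, month, 1))
        month += 1
        if month > 12:
            year, month = year + 1, 1
    return months


# Window boundaries as given in the abstract; monthly sampling is an assumption.
long_range = month_range(date(2008, 1, 1), date(2021, 6, 1))
short_range = month_range(date(2018, 1, 1), date(2021, 6, 1))

print(len(long_range), len(short_range))  # 162 42
```

Under that assumption, the long-range window holds 162 monthly observations and the short-range window 42, so the short-range models are fitted on roughly a quarter of the data.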
