Purpose: This study evaluates the effectiveness of Random Forest (RF) compared to Classification and Regression Trees (CART) in predicting hotel star ratings. The objective is to identify the algorithm that provides the most reliable and accurate classification based on diverse hotel attributes, in accordance with the standard categorization of star hotels. This matters because accurate star ratings play an important role in guiding consumer choices and strengthening competitive positioning in the hospitality industry. Method: This study used a comprehensive dataset of hotels in Banyumas Regency, covering location, facilities, room size, room type, room price, and customer reviews, to train both the RF and CART algorithms. Both algorithms were evaluated using accuracy, precision, recall, and F1 score. Additionally, both algorithms received the same preprocessing, and hyperparameter tuning was performed to improve the efficacy of each model. Result: The results showed that RF achieved higher overall accuracy and robustness than CART across all tests conducted. Furthermore, RF outperformed CART in per-class classification effectiveness, with higher precision and recall scores across multiple star rating categories, indicating better generalization and consistency. The RF classifier consistently surpassed the CART classifier in both accuracy and F1 score across all random states and test sizes, with a highest score of 0.9932 at a random state of 100 and a test size of 0.4. The most reliable results were obtained using RF with a random state of 42 and a test size of 0.2, giving an accuracy of 0.9909, precision of 1.0, recall of 1.0, and F1 score of 1.0. Under the same settings, CART achieved values of 0.9818, 1.0, 1.0, and 1.0, respectively.
This consistent performance, regardless of changes in random state and test size, illustrates the robustness and suitability of RF for classification tasks compared to CART. Novelty: This study offers new insights into applying machine learning to hotel star rating prediction using the RF and CART algorithms. The newly collected hotel dataset used in this study is a further contribution. A detailed comparative analysis is also provided, contributing to the existing literature by showing the effectiveness of RF over CART for this specific application. Future studies could explore the integration of additional machine learning methods to further enhance prediction accuracy and operational efficiency in the hospitality industry.
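The evaluation protocol described above (training RF and CART on the same data and scoring them over several random states and test sizes) could be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes scikit-learn, uses a synthetic dataset from `make_classification` as a stand-in for the Banyumas hotel data, and omits the hyperparameter tuning step. The function name `compare_rf_cart`, the macro averaging of precision, recall, and F1, and the stratified split are assumptions made for the example.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, precision_recall_fscore_support
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier


def compare_rf_cart(X, y, random_states=(42, 100), test_sizes=(0.2, 0.4)):
    """Train RF and CART on identical splits and collect test-set metrics.

    Hypothetical helper: the name, defaults, and averaging mode are
    assumptions for illustration, not taken from the study.
    """
    rows = []
    for seed in random_states:
        for test_size in test_sizes:
            # Same split for both models so the comparison is like-for-like.
            # Stratifying by class is an assumption of this sketch.
            X_tr, X_te, y_tr, y_te = train_test_split(
                X, y, test_size=test_size, random_state=seed, stratify=y
            )
            models = {
                "RF": RandomForestClassifier(n_estimators=100, random_state=seed),
                # scikit-learn's decision tree is an optimized CART variant.
                "CART": DecisionTreeClassifier(random_state=seed),
            }
            for name, model in models.items():
                model.fit(X_tr, y_tr)
                y_pred = model.predict(X_te)
                # Macro averaging treats every star class equally (assumption).
                precision, recall, f1, _ = precision_recall_fscore_support(
                    y_te, y_pred, average="macro", zero_division=0
                )
                rows.append(
                    {
                        "model": name,
                        "random_state": seed,
                        "test_size": test_size,
                        "accuracy": accuracy_score(y_te, y_pred),
                        "precision": precision,
                        "recall": recall,
                        "f1": f1,
                    }
                )
    return rows


if __name__ == "__main__":
    # Synthetic five-class data standing in for one-to-five-star hotels.
    X, y = make_classification(
        n_samples=500,
        n_features=8,
        n_informative=5,
        n_classes=5,
        n_clusters_per_class=1,
        random_state=0,
    )
    for row in compare_rf_cart(X, y):
        print(row)
```

Reusing one split for both models at each setting means any difference in scores reflects the algorithms rather than the partition of the data. The scores printed for the synthetic data say nothing about the hotel dataset.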