Abstract

Purpose: This study analyzes several of the main machine learning algorithms using big data obtained by web mining and tests how well these algorithms explain hotel room prices, in order to determine the model that best explains those prices.

Design/methodology/approach: Research data were obtained by web mining/scraping. The target website was scanned for about six months with the help of an algorithm, and data from 6558 accommodation facilities were used in the analysis. The second part of the research consists of statistical analysis and a comparison of machine learning algorithms. The Python programming language was used for the analysis and for implementing the algorithms: pandas and numpy for data processing, seaborn and matplotlib for graphics and visualization, and scikit-learn for running the machine learning algorithms. After the analysis, a model was built with logistic regression, which was judged the most suitable for the data.

Results: The Random Forest and Decision Tree algorithms both explain the data set at a rate of approximately 99.89%, so the tree-based (branching) approach was successful. The KNN algorithm achieved its highest performance, 62.12%, with a classification of three clusters. Among the linear classifiers (Logistic Regression, Stochastic Gradient Descent and Support Vector Machines), logistic regression obtained the highest score, 39.13%. In the model created by logistic regression, the score given to the hotel by guests, the hotel's rank among other hotels in the region, the type of hotel and the city in which it is located were found to be statistically significant (p < 0.05).

Discussion: Machine learning algorithms were compared using hotel room price data obtained by web mining/scraping. The research also yielded important information about hotel room rates in Turkey. A regression model revealed which of the 44 independent variables are significant in explaining the hotel room price.
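
As a rough illustration of the kind of comparison described above, the following minimal sketch fits the six named algorithms with scikit-learn and ranks them by held-out accuracy. It is not the authors' code. The data are synthetic (generated with `make_classification`), the three-class target is an assumed stand-in for binned room prices, and the hyperparameters, feature scaling, train/test split and the use of accuracy as the score are all assumptions, so the printed numbers have no relation to the figures reported in the abstract.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression, SGDClassifier
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import LinearSVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the scraped hotel data (assumption: 3 price classes).
X, y = make_classification(
    n_samples=1000, n_features=20, n_informative=8,
    n_classes=3, random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# The six algorithms named in the abstract; all settings are illustrative.
models = {
    "Random Forest": RandomForestClassifier(random_state=0),
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    "KNN": make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=3)),
    "Logistic Regression": make_pipeline(
        StandardScaler(), LogisticRegression(max_iter=1000)
    ),
    "SGD": make_pipeline(StandardScaler(), SGDClassifier(random_state=0)),
    "Linear SVM": make_pipeline(
        StandardScaler(), LinearSVC(max_iter=5000, random_state=0)
    ),
}

# Fit each model and record its accuracy on the held-out test split.
scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = model.score(X_test, y_test)

# Print the models from best to worst test accuracy.
for name, acc in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {acc:.4f}")
```

Scoring on a held-out split rather than on the training data matters for tree-based models in particular, since an unpruned decision tree can fit its training set almost perfectly.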

