ABSTRACT
Machine learning (ML) algorithms are now widely used in mass valuation. Their performance is believed to depend on factors such as the functions used, model parameters, and the quantity and quality of the data. One factor that is often overlooked is location. This study evaluates the performance of a set of ML algorithms in predicting real estate values from a spatial perspective. To bring out spatial differences in performance, a study area was chosen in which the factors affecting value are spatially heterogeneous. Values were predicted using the Linear Regression, Random Forest, Support Vector Machine, Regression Trees, Gaussian Process Regression, Artificial Neural Network, and Least-Squares Boosting algorithms, and Bayesian Optimization was used to improve their performance. The algorithms were compared using the R², RMSE, MSE, MAPE, and MAE metrics, and the Random Forest model performed best on these metrics. Spatial predictions were then generated in a GIS environment from the test data using the inverse distance weighting (IDW) spatial interpolation method, and maps were produced showing the spatial distribution of the predicted and actual housing values. The spatial distributions of the algorithms' predictions were compared both with the actual values and with each other. Although the Random Forest model performed best overall, other models also performed well in some locations and, in places, produced better results than the Random Forest model. This shows that the performance of ML algorithms varies with location. The study is expected to contribute to the many fields of science in which ML algorithms are used, especially real estate valuation, and to bring a new perspective to the literature.
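The two computational steps named above, scoring predictions with the five metrics and spreading point values over space with IDW, can be illustrated with a minimal Python sketch. It is not the study's implementation: the function names `regression_metrics` and `idw_predict`, the default distance exponent `power=2.0`, and the sample numbers are all assumptions made for illustration, and the abstract does not state which GIS software or IDW parameters were used.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return R2, RMSE, MSE, MAPE (in percent) and MAE for paired values."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mean_true = sum(y_true) / n
    ss_res = sum(e * e for e in errors)                  # residual sum of squares
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)   # total sum of squares
    mse = ss_res / n
    return {
        "R2": 1.0 - ss_res / ss_tot,
        "RMSE": math.sqrt(mse),
        "MSE": mse,
        # MAPE is undefined when an actual value is zero.
        "MAPE": 100.0 * sum(abs(e / t) for e, t in zip(errors, y_true)) / n,
        "MAE": sum(abs(e) for e in errors) / n,
    }


def idw_predict(known_points, known_values, query_point, power=2.0):
    """Inverse distance weighted estimate at query_point.

    power=2.0 is a common default and an assumption here, not a value
    reported in the abstract.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for (x, y), value in zip(known_points, known_values):
        distance = math.hypot(query_point[0] - x, query_point[1] - y)
        if distance == 0.0:
            return value  # query coincides with a sample point
        weight = 1.0 / distance ** power
        weighted_sum += weight * value
        weight_total += weight
    return weighted_sum / weight_total


# Hypothetical values, for illustration only.
actual = [100.0, 200.0, 300.0]
predicted = [110.0, 190.0, 300.0]
metrics = regression_metrics(actual, predicted)

sample_points = [(0.0, 0.0), (2.0, 0.0)]
sample_values = [100.0, 200.0]
midpoint_estimate = idw_predict(sample_points, sample_values, (1.0, 0.0))
```

Applying `idw_predict` over a grid of query points, once per model, would give one prediction surface per algorithm. Those surfaces could then be compared with a surface built from the actual values, which is the kind of location-by-location comparison the abstract describes.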