This study proposes new housing price indices computed with machine learning techniques. We focus on random forest and artificial neural network models, which have performed well in existing real estate studies, and train them on micro-level real estate transaction data and housing characteristics. The machine learning based models show much stronger forecasting power than the hedonic model, in terms of both explanatory power and estimation performance, and the random forest model performs best on both. The housing price indices based on the machine learning models also show greater volatility than the currently used indices during periods of rising housing prices. Given that the existing indices are known to have a smoothing problem, we interpret this result as evidence that the new machine learning based indices reflect market trends successfully.
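
The abstract does not spell out how a fitted model is turned into an index, so the following is only a minimal sketch under stated assumptions, not the authors' procedure. It fits a time-dummy hedonic regression as the baseline and then builds an index from an arbitrary prediction function by pricing a fixed basket of homes in every period (an imputation approach, assumed here for illustration). The data are synthetic, and all names (`fit_hedonic`, `imputation_index`, `hedonic_predict`, the two features) are hypothetical.

```python
import numpy as np


def fit_hedonic(log_price, features, period, n_periods):
    """Time-dummy hedonic regression: log price on characteristics and period dummies."""
    n = len(log_price)
    # One dummy column per period except the base period 0.
    dummies = (period[:, None] == np.arange(1, n_periods)[None, :]).astype(float)
    X = np.column_stack([np.ones(n), features, dummies])
    coef, *_ = np.linalg.lstsq(X, log_price, rcond=None)
    k = features.shape[1]
    intercept, beta, gamma = coef[0], coef[1:1 + k], coef[1 + k:]
    time_effects = np.concatenate([[0.0], gamma])  # base period effect is 0
    return intercept, beta, time_effects


def imputation_index(predict_log_price, basket, n_periods):
    """Index from any fitted model: price the same basket of homes in each period."""
    levels = np.array([predict_log_price(basket, t).mean() for t in range(n_periods)])
    return 100.0 * np.exp(levels - levels[0])  # base period = 100


# Synthetic transactions (assumption: floor area and building age as characteristics).
rng = np.random.default_rng(0)
n, n_periods = 2000, 5
features = np.column_stack([rng.uniform(40, 150, n), rng.uniform(0, 30, n)])
period = rng.integers(0, n_periods, n)
true_trend = np.array([0.00, 0.02, 0.05, 0.04, 0.10])
log_price = (10.0 + 0.01 * features[:, 0] - 0.005 * features[:, 1]
             + true_trend[period] + rng.normal(0.0, 0.05, n))

# Baseline: the hedonic index is read directly off the period coefficients.
intercept, beta, time_effects = fit_hedonic(log_price, features, period, n_periods)
hedonic_index = 100.0 * np.exp(time_effects)


# The same index via imputation; a random forest or neural network would
# replace this function with its own prediction for (characteristics, period).
def hedonic_predict(basket, t):
    return intercept + basket @ beta + time_effects[t]


basket = features[:200]  # fixed reference basket of homes
model_index = imputation_index(hedonic_predict, basket, n_periods)
print(np.round(hedonic_index, 1))
```

For the linear hedonic model the two constructions coincide, which makes the imputation function easy to check. With a non-linear model the index would generally depend on the chosen basket, so that choice would need to be stated explicitly.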