Abstract. Real estate price prediction plays a vital role in urban planning, investment decision-making, and risk management. However, existing prediction models often generalize poorly and are sensitive to outliers when faced with complex nonlinear relationships, multidimensional features, and noisy data. Choosing a model that captures complex patterns accurately and remains robust has therefore become a focus of research. This paper applies the random forest model to the Boston housing price dataset and compares it with multivariate linear regression, XGBoost, and support vector machine (SVM). Unlike a traditional regression model, a random forest aggregates many decision trees, each trained on a bootstrap sample of the data with random feature subsets, which lets it better handle the complex nonlinear relationships in this dataset. The experimental results show that the random forest model performed well on all evaluation metrics, with MSE = 8.2502, RMSE = 2.8723, MAE = 2.0668, and R^2 = 0.8875. These results indicate that the random forest model outperforms the other models in prediction accuracy and shows clear advantages in handling data complexity and in generalization. The random forest model therefore provides an efficient and reliable tool for future real estate price prediction research and applications.
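
For readers unfamiliar with the four evaluation metrics reported above, the following is a minimal sketch of how they are conventionally computed from paired observations and predictions. The function name `regression_metrics` and the sample values are hypothetical and are not taken from the paper's experiments; the sketch uses only the Python standard library.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return MSE, RMSE, MAE and R^2 for paired observations and predictions."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mse = sum(e * e for e in errors) / n   # mean squared error
    mae = sum(abs(e) for e in errors) / n  # mean absolute error
    mean_true = sum(y_true) / n
    ss_res = mse * n                       # residual sum of squares
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    # R^2 is undefined when every target is identical; report NaN in that case.
    r2 = 1.0 - ss_res / ss_tot if ss_tot > 0 else float("nan")
    return {"mse": mse, "rmse": math.sqrt(mse), "mae": mae, "r2": r2}


# Hypothetical prices (illustrative only, not results from the paper).
y_true = [24.0, 21.6, 34.7, 33.4]
y_pred = [25.0, 20.6, 32.7, 33.4]
metrics = regression_metrics(y_true, y_pred)
```

Because RMSE is the square root of MSE, the two reported values can be checked against each other: the square root of 8.2502 is approximately 2.8723, which matches the abstract.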