A single machine learning model used for house price prediction tends to suffer from low accuracy, insufficient stability, and a certain lag. To address these problems, this paper proposes a multimodel fusion house price prediction model based on stacking ensemble learning. First, web search data affecting house prices were collected with a web crawler, and Spearman correlation analysis was performed on the attribute set to reduce its complexity and establish a prediction index system for four first-tier cities in China. Second, with the goal of improving accuracy, diversity, and generalization ability, the types of base learners and the metalearner were determined, and the parameters of the base learners were optimized with the grey wolf optimization (GWO) algorithm to produce the GWO-stacking model. Experimental results on four datasets demonstrate that the model achieves high prediction accuracy. Finally, to address the black-box nature of machine learning models, the state-of-the-art interpretation method SHAP was combined with the LightGBM algorithm to interpret the model. The results can serve as a basis for real estate policy planning and adjustment and can even guide the demand of home buyers, thus improving the efficiency and effectiveness of government policy making.
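
The Spearman screening step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the feature names, the toy values, and the 0.5 cutoff are all assumptions made for illustration, and the abstract does not state which threshold or tooling was used. The sketch ranks each candidate attribute and the price series, computes the Pearson correlation of the ranks, and keeps attributes whose absolute coefficient reaches the cutoff.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def spearman(x, y):
    """Spearman coefficient: Pearson correlation of the two rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    if vx == 0 or vy == 0:
        return 0.0  # constant series: treated as uninformative in this sketch
    return cov / sqrt(vx * vy)


def screen_features(features, target, threshold=0.5):
    """Keep features with |rho| >= threshold (the threshold is an assumption)."""
    kept = {}
    for name, series in features.items():
        rho = spearman(series, target)
        if abs(rho) >= threshold:
            kept[name] = rho
    return kept


# Made-up toy values, for illustration only (not data from the paper).
price = [10, 12, 15, 19, 24, 30]
candidates = {
    "mortgage_rate_searches": [3, 5, 6, 9, 11, 14],  # hypothetical keyword
    "unrelated_searches": [4, 1, 6, 2, 5, 3],        # hypothetical keyword
}
selected = screen_features(candidates, price)
```

In this toy case only the monotonically related keyword survives the screen. The surviving attributes would then form the index system passed on to the base learners.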