Abstract

Eliminating the systematic errors of member models is a key step before variant‐weight multi‐model ensemble forecasting. However, how to calculate model systematic errors reasonably remains an open question. Taking surface temperature as the study variable, this work addresses the question through comparative analyses of a series of variant‐weight multi‐model ensemble forecasting experiments. The results show that eliminating the systematic errors of member models dramatically improves the ensemble forecasts. Calculating model systematic errors over the optimal time lengths, which are determined from the spatiotemporal distribution characteristics of the model forecasting errors, is a reasonable and effective approach. Surface temperature forecasts whose model systematic errors are calculated with the optimal time lengths, or with time lengths beyond the optimal ones, exhibit higher accuracy. Owing to the approximately uniform distribution of temperature, there is no significant difference in accuracy between surface temperature forecasts whose model systematic errors are calculated using the average method and those calculated using the quantile method. In addition, updating the models that participate in the ensemble weakens the representativeness of the optimal time lengths for systematic error calculation, which are obtained from historical data, and thus weakens the forecasting skill. These results offer guidance for improving the accuracy of variant‐weight multi‐model ensemble forecasting.
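The workflow summarized above (estimate each member model's systematic error over a chosen time length, remove it, then combine the corrected members with weights) can be illustrated with a minimal Python sketch. Everything specific in it is an assumption made for illustration, not the procedure used in the study: the function names are hypothetical, the "quantile method" is represented by the median, the weights are taken as inversely proportional to each model's mean absolute error after correction, and `window` merely stands in for the optimal time length, whose selection is not reproduced here.

```python
import numpy as np


def systematic_error(forecasts, observations, window, method="mean"):
    """Estimate one model's systematic error from the last `window` days.

    `method="mean"` stands in for the average method; `method="median"`
    is an assumed stand-in for the quantile method (0.5 quantile).
    """
    errors = (np.asarray(forecasts, dtype=float)[-window:]
              - np.asarray(observations, dtype=float)[-window:])
    if method == "mean":
        return float(np.mean(errors))
    if method == "median":
        return float(np.median(errors))
    raise ValueError(f"unknown method: {method!r}")


def ensemble_forecast(history, observations, new_forecasts, window,
                      method="mean"):
    """Bias-correct each member, then form a weighted ensemble forecast.

    history:       array of shape (n_models, n_days), past forecasts
    observations:  array of shape (n_days,), matching observations
    new_forecasts: array of shape (n_models,), forecasts to be combined
    window:        number of most recent days used for the error estimate
                   (a placeholder for the optimal time length)
    """
    history = np.asarray(history, dtype=float)
    obs = np.asarray(observations, dtype=float)

    # Systematic error of each member model over the chosen window.
    biases = np.array(
        [systematic_error(h, obs, window, method) for h in history]
    )

    # Residual errors of the bias-corrected historical forecasts.
    residuals = history[:, -window:] - biases[:, None] - obs[-window:]

    # Assumed weighting scheme: inverse mean absolute residual error,
    # floored to avoid division by zero, normalized to sum to one.
    mae = np.mean(np.abs(residuals), axis=1)
    inverse = 1.0 / np.maximum(mae, 1e-6)
    weights = inverse / inverse.sum()

    corrected = np.asarray(new_forecasts, dtype=float) - biases
    return float(np.dot(weights, corrected)), biases, weights


if __name__ == "__main__":
    # Synthetic temperatures: model A runs 1 degree warm, model B 2 degrees cold.
    history = [[21.0, 22.0, 23.0, 24.0], [18.0, 19.0, 20.0, 21.0]]
    observations = [20.0, 21.0, 22.0, 23.0]
    value, biases, weights = ensemble_forecast(
        history, observations, new_forecasts=[25.0, 22.0], window=3
    )
    print(value, biases, weights)
```

In this synthetic case both members agree once their constant offsets are removed, so the weighting has no effect on the result; with real forecasts the weights would differ between members and would change as the window advances.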