In recent years, the share of volatile new energy sources in the power system has increased significantly, making system scheduling more difficult. Accurate load forecasting is an important prerequisite for flexible scheduling. Load is highly regular and therefore relatively easy to predict; however, steep changes in load can cause significant forecasting deviations. To address this issue, this article first selects input variables that help the model identify steep changes in load, based on the Pearson correlation coefficient and the proposed “steep change impact rate”. Then, a Conv2D-Gated Recurrent Unit (Conv2D-GRU) model is built to fully extract steep-change information from the inputs and produce day-ahead load forecasts. Naive persistence, autoregressive (AR), gradient boosting decision trees (GBDT), convolutional neural network (CNN), long short-term memory (LSTM) and gated recurrent unit (GRU) models are used for comparison. Compared with naive persistence, Conv2D-GRU-SC reduces the mean absolute error (MAE) by 54.08 % and the root mean square error (RMSE) by 57.58 %, and increases the R-squared (R²) by 51.31 %.
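To make the reported comparison concrete, the following is a minimal sketch of a naive persistence baseline together with the three evaluation metrics (MAE, RMSE, R²) and a relative-change helper. It is an illustration under stated assumptions, not the implementation used in this article: the function names, the toy load values and the horizon of 2 steps are hypothetical, and the exact way the percentage changes above were computed is not specified here.

```python
import math

def persistence_forecast(load, horizon):
    """Naive persistence: the forecast for step t is the observed load at t - horizon.
    For day-ahead forecasting, horizon would be the number of samples per day (assumption)."""
    return load[:-horizon]

def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    """Root mean square error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

def relative_change(baseline, model):
    """Percentage change of a model's metric relative to the baseline's metric.
    Negative values mean the metric decreased (assumed definition)."""
    return 100.0 * (model - baseline) / baseline

# Hypothetical toy load series (arbitrary units), not data from this article.
load = [10.0, 12.0, 11.0, 13.0, 12.0, 14.0]
horizon = 2  # toy value; a real day-ahead setup would use one full day of samples

y_pred = persistence_forecast(load, horizon)
y_true = load[horizon:]

print("MAE :", mae(y_true, y_pred))
print("RMSE:", rmse(y_true, y_pred))
print("R2  :", r2(y_true, y_pred))
```

With these helpers, a statement such as "a decrease of 54.08 % in MAE" would correspond to `relative_change(mae_persistence, mae_model)` returning roughly -54.08, assuming the change is measured relative to the persistence baseline.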