In this paper, we propose a new ensemble residual network model for short-term load forecasting (STLF) that improves the accuracy of forecasts made 24 hours in advance. The model has a two-stage structure. The first stage combines several fully-connected layers into a structure similar to a recurrent neural network (RNN); features extracted from historical load data are fed to this stage to obtain preliminary predictions. The second stage is a modified residual network, which outputs the final predictions. To improve the generalization capability of the model, we use a snapshot ensemble with learning rate decay. The proposed model was trained and tested on two public datasets. Numerical tests show that it achieves better forecasting results than the other methods compared, and that the adopted ensemble method effectively improves its generalization ability.
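The snapshot ensemble idea mentioned above can be illustrated with a minimal sketch. It is not the model of this paper: a toy linear regressor stands in for the two-stage network, and the cosine schedule with restarts, the function names (`cyclic_lr`, `train_snapshots`, `ensemble_predict`), and all hyperparameters are assumptions chosen only for illustration. The sketch shows the general pattern of decaying the learning rate within a cycle, saving the weights at the end of each cycle, and averaging the predictions of the saved snapshots.

```python
import numpy as np


def cyclic_lr(step, steps_per_cycle, lr_max):
    """Cosine-decayed learning rate that restarts at the start of each cycle.

    Assumed schedule for illustration; the paper only states that
    learning rate decay is used.
    """
    t = (step % steps_per_cycle) / steps_per_cycle
    return 0.5 * lr_max * (1.0 + np.cos(np.pi * t))


def train_snapshots(X, y, n_cycles=3, steps_per_cycle=200, lr_max=0.1, seed=0):
    """Full-batch gradient descent on a linear model (a stand-in for the network).

    A copy of the weights is stored at the end of every cycle, when the
    learning rate has decayed close to zero.
    """
    rng = np.random.default_rng(seed)
    w = rng.normal(scale=0.1, size=X.shape[1])
    snapshots = []
    for step in range(n_cycles * steps_per_cycle):
        # Gradient of the mean squared error with respect to w.
        grad = 2.0 * X.T @ (X @ w - y) / len(y)
        w = w - cyclic_lr(step, steps_per_cycle, lr_max) * grad
        if (step + 1) % steps_per_cycle == 0:
            snapshots.append(w.copy())
    return snapshots


def ensemble_predict(snapshots, X):
    """Average the predictions of all saved snapshots."""
    return np.mean([X @ w for w in snapshots], axis=0)
```

With a nonconvex model such as a deep network, the snapshots from different cycles may settle in different minima, which is what makes averaging them useful; in this convex toy problem they nearly coincide, so the sketch shows only the mechanics.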