Deep learning has developed rapidly in recent years and performs well across a wide range of image and video processing tasks. It has also advanced the spatio-temporal fusion of remote sensing images, producing reconstructed images of good visual quality. The invention relates to a remote sensing image fusion method based on a progressive cascade deep residual network and provides an end-to-end progressive cascade deep residual network model for remote sensing image fusion. Because the MSE loss function can over-smooth the fused image, a new joint loss function is defined to capture finer spatial information and improve the spatial resolution of the fused image. Resize-convolution replaces transposed convolution to eliminate the checkerboard artifacts that transposed convolution introduces into the fused image. In experiments on simulated and real remote sensing image fusion datasets from multiple satellites, the proposed algorithm outperforms the comparison algorithm by more than 5.25% on average across the quantitative metrics. Computation time and system resource usage are also reduced. The method therefore has theoretical significance and practical value for artificial intelligence and image processing, and is expected to support further research on and application of remote sensing image fusion.
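The difference between transposed convolution and resize-convolution can be illustrated with a minimal one-dimensional sketch. It assumes a single channel, an all-ones kernel of size 3, and an upsampling factor of 2. The function names are hypothetical and chosen only for illustration; nothing here is taken from the invention's actual network.

```python
# Illustrative sketch only: 1-D, single channel, all-ones kernel.
# These settings are assumptions for demonstration, not the described network.

def transposed_conv1d(signal, kernel, stride):
    """Transposed convolution: each input sample scatters a scaled kernel copy."""
    out = [0.0] * ((len(signal) - 1) * stride + len(kernel))
    for i, value in enumerate(signal):
        for k, weight in enumerate(kernel):
            out[i * stride + k] += value * weight
    return out


def nearest_upsample1d(signal, factor):
    """Nearest-neighbour resize: repeat every sample `factor` times."""
    return [value for value in signal for _ in range(factor)]


def conv1d_valid(signal, kernel):
    """Stride-1 convolution (correlation form) without padding."""
    width = len(kernel)
    return [
        sum(signal[i + k] * kernel[k] for k in range(width))
        for i in range(len(signal) - width + 1)
    ]


def resize_conv1d(signal, kernel, factor):
    """Resize-convolution: upsample first, then apply an ordinary convolution."""
    return conv1d_valid(nearest_upsample1d(signal, factor), kernel)


flat = [1.0] * 6            # a perfectly flat input region
kernel = [1.0, 1.0, 1.0]    # kernel size 3 is not divisible by stride 2

transposed = transposed_conv1d(flat, kernel, stride=2)
resized = resize_conv1d(flat, kernel, factor=2)

print("transposed conv:", transposed)
print("resize-conv    :", resized)
```

Under these assumed settings, the interior of the transposed-convolution output alternates between 2 and 1 even though the input is flat, because neighbouring kernel copies overlap unevenly. The resize-convolution output is a constant 3 throughout, since every output sample is covered by the same number of kernel taps.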