Space-Time Image Velocimetry (STIV) is a highly promising method for river flow measurement because it is safe and efficient. However, the traditional STIV method struggles to produce stable and accurate results under external disturbances such as obstacles, waves, and noise. To address this, a residual network-based STIV method is constructed by combining STIV with deep learning. On a synthetic dataset, thanks to its residual module and relation-aware global attention mechanism, the method recognizes the texture angle of the space-time image (STI) with an accuracy of 99.9 %, much higher than that of the EfficientNet-B3, ViT16, and VGG19 networks. On a real dataset, the proposed method is more stable and has lower error than the gradient tensor (GT) and fast Fourier transform (FFT) methods: its overall recognition accuracy is above 90 % for all types of STI cases, its angle error is within 2°, and its flow velocity error is around 0.10 m/s. Furthermore, the model was validated by a one-year comparative measurement at the Qilijie hydrological station in 2022, and the results were compared with those of the LSPIV, GT, and FFT methods. The R² between the model estimates and the measured values is 0.98, the RMSE is 0.17, and the MAE is 0.12 m/s, indicating that the method is sufficiently accurate and robust for long-term, stable, real-time monitoring of river flow.
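To make the quantities above concrete, the following is a minimal sketch of how a recognized STI texture angle might be converted to a surface velocity and how the reported error metrics could be computed. It is not the authors' implementation. It assumes the commonly used STIV relation v = tan(φ) · S_x / S_t, with the angle φ measured from the time axis, S_x the spatial resolution in metres per pixel, and S_t the frame interval in seconds. It also assumes that R² denotes the coefficient of determination. All function and parameter names are hypothetical.

```python
import math


def velocity_from_angle(angle_deg, metres_per_pixel, seconds_per_frame):
    """Velocity along the search line from an STI texture angle.

    Assumption: angle is measured from the time axis of the STI,
    so v = tan(angle) * S_x / S_t.
    """
    return math.tan(math.radians(angle_deg)) * metres_per_pixel / seconds_per_frame


def rmse(predicted, observed):
    """Root-mean-square error between two equal-length sequences."""
    n = len(observed)
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(predicted, observed)) / n)


def mae(predicted, observed):
    """Mean absolute error between two equal-length sequences."""
    n = len(observed)
    return sum(abs(p - o) for p, o in zip(predicted, observed)) / n


def r_squared(predicted, observed):
    """Coefficient of determination, 1 - SS_res / SS_tot (assumed definition)."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for p, o in zip(predicted, observed))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


if __name__ == "__main__":
    # Illustrative values only; these are not data from the study.
    v = velocity_from_angle(45.0, metres_per_pixel=0.01, seconds_per_frame=0.04)
    print(f"velocity at 45 degrees: {v:.3f} m/s")
```

Because the velocity depends on tan(φ), the same angle error produces a larger velocity error at steep texture angles than at shallow ones, so angle accuracy matters most for fast flows or coarse time resolution.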