Settlement is one of the most critical safety evaluation indicators for concrete face rock-fill dams. Deformation measurements at different monitoring points are spatiotemporally correlated because the points share similar environmental factors and material properties. However, existing research has focused mainly on single-point prediction models, which ignore the relationships among settlement values at different parts of a dam. To address this limitation, a deep learning approach based on spatiotemporal feature fusion is proposed that accounts for the spatiotemporal correlation of settlements across multiple measurement points. First, a density peak clustering algorithm groups monitoring points with similar deformation patterns. Then, a deep learning model combining a residual neural network (ResNet) and a gated recurrent unit (GRU) neural network, termed ResGRU, is established. In ResGRU, ResNet maps the input features to higher-dimensional representations through convolution to extract deep nonlinear spatial features, and GRU captures the temporal dependencies in the data to extract temporal features. Finally, the spatial and temporal features are fused through fully connected layers to predict settlements. The model is validated on monitoring data collected from an actual concrete face rock-fill dam and compared with other deep learning models. The proposed approach achieves an average RMSE of 0.352, an average MAE of 0.255, and an average R² of 0.988, and generally outperforms the other models. The method provides an efficient and convenient tool for the settlement analysis of concrete face rock-fill dams.
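For readers less familiar with the three error metrics reported above (RMSE, MAE, and the coefficient of determination), the following is a minimal sketch of their standard definitions in plain Python. The function names and the sample readings are illustrative assumptions; they are not the study's code or data, and the abstract does not state the units of the reported values.

```python
import math


def rmse(y_true, y_pred):
    """Root mean square error: square root of the mean squared residual."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def mae(y_true, y_pred):
    """Mean absolute error: mean of the absolute residuals."""
    n = len(y_true)
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / n


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical settlement readings for one monitoring point (not study data).
observed = [10.0, 12.0, 15.0, 19.0]
predicted = [10.5, 11.5, 15.5, 18.5]

print(rmse(observed, predicted), mae(observed, predicted), r2(observed, predicted))
```

Lower RMSE and MAE indicate smaller prediction errors, while an R² closer to 1 indicates that the predictions explain more of the variance in the observed settlements. An average over several monitoring points, as reported above, would simply be the mean of these per-point values.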