In coal mining areas, surface subsidence poses significant risks to human life and property. Fortunately, mining-induced surface subsidence can be monitored and predicted with various methods, e.g., the probability integral method and deep learning (DL) methods. Although DL methods show promise in predicting subsidence, their accuracy is often limited because they give insufficient consideration to spatial correlation and temporal nonlinearity. To address this issue, we propose a novel DL-based approach for predicting mining surface subsidence. Our method uses K-means clustering to partition the spatial data and then applies a gated recurrent unit (GRU) model within each partition to capture the nonlinear relationships in the subsidence time series. Optimizing the model with snake optimization (SO) further improves its global accuracy. Validation shows that our method outperforms traditional long short-term memory (LSTM) and GRU models, with 99.1% of sample pixels having an absolute error of less than 8 mm.
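As a rough illustration of two steps named above, the spatial partitioning and the reported accuracy metric, the following minimal sketch may help. It is not the authors' implementation: the choice of pixel coordinates as clustering features, the function names `kmeans` and `fraction_within`, and the toy values are all assumptions made for illustration, and the per-partition GRU and the SO tuning are deliberately left out.

```python
import random


def kmeans(points, k, iters=50, seed=0):
    """Plain Lloyd's algorithm on a list of coordinate tuples.

    Hypothetical helper: the abstract does not say which features are
    clustered, so pixel coordinates are assumed here.
    Returns (labels, centroids).
    """
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest centroid by squared Euclidean distance.
        labels = [
            min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(pt, centroids[c])),
            )
            for pt in points
        ]
        # Update step: mean of the members; keep the old centroid if empty.
        new_centroids = []
        for c in range(k):
            members = [pt for pt, lab in zip(points, labels) if lab == c]
            if members:
                new_centroids.append(
                    tuple(sum(col) / len(members) for col in zip(*members))
                )
            else:
                new_centroids.append(centroids[c])
        if new_centroids == centroids:
            break  # converged: centroids no longer move
        centroids = new_centroids
    return labels, centroids


def fraction_within(predicted_mm, observed_mm, tol_mm=8.0):
    """Share of pixels whose absolute error is strictly below tol_mm."""
    hits = sum(abs(p - o) < tol_mm for p, o in zip(predicted_mm, observed_mm))
    return hits / len(observed_mm)


# Toy data (made up): six pixel coordinates forming two spatial groups.
pixels = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
labels, centroids = kmeans(pixels, k=2)

# In the described pipeline, a separate sequence model would be fitted to the
# subsidence time series of each partition; that step is omitted here.
share = fraction_within([1.0, 5.0, 20.0, -3.0], [0.0, 0.0, 0.0, 0.0])
```

With this metric, the figure quoted in the abstract would correspond to `fraction_within(...)` returning about 0.991 at a tolerance of 8 mm.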