Accurate prediction of regional terrestrial water storage change (TWSA) is of great significance for water resources planning and management and for early warning of extreme climate disasters. Because conventional methods struggle to predict TWSA time series accurately, six typical regions in China are selected as case studies: the upper reaches of the Yangtze River (UYR), the southwest region (SWR), the Liaohe River Basin (LRB), the North China Plain (NCP), the Qinghai-Tibet Plateau (QTP), and the Pearl River Basin (PRB). The GRACE/GRACE-FO mascon product provided by CSR is used to extract the TWSA time series for these six regions. An improved Back Propagation (BP) neural network, a Long Short-Term Memory (LSTM) neural network, and a Bidirectional LSTM neural network with an attention mechanism (BiLSTM-attention) are proposed to predict and analyze regional TWSA. In the experiments, the selection of optimal model parameters, such as the number of hidden layer nodes and the number of hidden units, is tested and analyzed in detail. The model predictions are also compared with those of the traditional least squares method and the random forest (RF) method. The root mean square error (RMSE), coefficient of determination (R2), Nash–Sutcliffe efficiency coefficient (NSE), and mean absolute percentage error (MAPE) are used to evaluate prediction accuracy. The results show that the improved BP, LSTM, and BiLSTM-attention models all achieve high prediction accuracy in the UYR and SWR regions: RMSE is below 2.641 cm, R2 is at least 0.8, NSE is above 0.6, and MAPE is within 0.1. Compared with the least squares method, the RMSE of the three neural network models decreases by 0.998 cm, 0.700 cm, and 0.7563 cm on average, and R2 increases by 81.75%, 69.89%, and 72% on average.
Compared with the RF method, the RMSE of the three neural network models is reduced by 0.601 cm, 0.316 cm, and 0.360 cm, and R2 is increased by 38.20%, 24.60%, and 27.06% on average. NSE and RMSE also improve to varying degrees in these regions. These results show that the improved BP, LSTM, and BiLSTM-attention models can effectively predict TWSA. The methods and results of this paper can provide an important reference for the rational utilization of regional water resources and for disaster risk assessment.
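For readers who want to reproduce the evaluation, the four accuracy metrics named in the abstract can be sketched as below. This is a minimal illustration rather than the authors' implementation. It assumes that R2 is computed as the squared Pearson correlation between observed and predicted values, that NSE is one minus the ratio of residual to total sum of squares, and that MAPE is reported as a fraction rather than a percentage. The function names and the sample series are hypothetical.

```python
import math

def rmse(obs, pred):
    """Root mean square error, in the units of the input series (e.g. cm)."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))

def r2(obs, pred):
    """Coefficient of determination, ASSUMED here to be the squared Pearson correlation."""
    n = len(obs)
    mean_o = sum(obs) / n
    mean_p = sum(pred) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(obs, pred))
    var_o = sum((o - mean_o) ** 2 for o in obs)
    var_p = sum((p - mean_p) ** 2 for p in pred)
    return cov ** 2 / (var_o * var_p)

def nse(obs, pred):
    """Nash-Sutcliffe efficiency: 1 - residual sum of squares / total sum of squares."""
    mean_o = sum(obs) / len(obs)
    ss_res = sum((o - p) ** 2 for o, p in zip(obs, pred))
    ss_tot = sum((o - mean_o) ** 2 for o in obs)
    return 1.0 - ss_res / ss_tot

def mape(obs, pred):
    """Mean absolute percentage error as a fraction; undefined if any observation is 0."""
    return sum(abs((o - p) / o) for o, p in zip(obs, pred)) / len(obs)

# Hypothetical TWSA values in cm, for illustration only (not data from the paper).
observed = [2.0, 4.0, 6.0, 8.0]
predicted = [3.0, 5.0, 7.0, 9.0]

print(rmse(observed, predicted))  # 1.0
print(r2(observed, predicted))    # 1.0
print(nse(observed, predicted))   # 0.8
print(mape(observed, predicted))
```

In this toy series the prediction has a constant 1 cm bias, so the correlation-based R2 stays at 1 while NSE drops to 0.8, which is one reason to report both. MAPE becomes unstable when observed values are close to zero, so it should be read with care for anomaly series.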