Accurate prediction of water quality contributes to the intelligent management of water resources. Water quality indices are nonlinear and have time series characteristics, but existing models that introduce long short-term memory (LSTM) focus only on the forward time series and do not consider parallel computation within the model. To address this, a new neural network called LSTM-multihead attention (LMA) was constructed to predict water quality, using LSTM to process time series data and multihead attention for parallel computation and feature extraction. Additionally, water quality data involve multiple data types and complex correlations among indices, and they contain missing and abnormal values. To solve these problems, this study proposes a water quality prediction model called GRA-LMA, based on linear interpolation, gray relational analysis (GRA), and LMA. Two experiments are carried out to verify the predictive performance of GRA-LMA, with water quality data from the Huaihe River Basin as the case study sample. The first experiment focuses on data processing, including the handling of missing and abnormal water quality data and the correlation analysis of water quality indices. Linear interpolation is adopted to fill in the missing data, while a combination of boxplot and histogram is adopted to identify and eliminate the abnormal data, which are then repaired with linear interpolation. Gray relational analysis is adopted to calculate the correlation between different water quality indices, and the indices with high correlation are retained as the input variables of the water quality prediction model. The data processing results demonstrate that linear interpolation can repair the data without altering the pattern of data change and that gray relational analysis reduces the quantity of input data the model needs.
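The data processing steps of the first experiment can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the 1.5 × IQR boxplot fence, the min-max scaling, and the distinguishing coefficient `rho = 0.5` are all assumptions chosen for illustration, and the gray relational grade is computed pairwise rather than over the full set of indices.

```python
import statistics


def interpolate_missing(series):
    """Fill None gaps by linear interpolation between the nearest known values.

    Gaps at either end are filled with the nearest known value (an assumption).
    """
    out = list(series)
    known = [i for i, v in enumerate(out) if v is not None]
    if not known:
        raise ValueError("series has no known values")
    for i, v in enumerate(out):
        if v is not None:
            continue
        left = max((k for k in known if k < i), default=None)
        right = min((k for k in known if k > i), default=None)
        if left is None:
            out[i] = series[right]
        elif right is None:
            out[i] = series[left]
        else:
            t = (i - left) / (right - left)
            out[i] = series[left] + t * (series[right] - series[left])
    return out


def mask_outliers(series, k=1.5):
    """Replace values outside the boxplot fences [Q1 - k*IQR, Q3 + k*IQR] with None."""
    values = [v for v in series if v is not None]
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v if v is not None and low <= v <= high else None for v in series]


def gray_relational_grade(reference, candidate, rho=0.5):
    """Pairwise gray relational grade of `candidate` with respect to `reference`."""

    def scale(xs):
        lo, hi = min(xs), max(xs)
        return [(x - lo) / (hi - lo) if hi > lo else 0.0 for x in xs]

    ref, cand = scale(reference), scale(candidate)
    diffs = [abs(a - b) for a, b in zip(ref, cand)]
    d_min, d_max = min(diffs), max(diffs)
    if d_max == 0:
        return 1.0  # the two scaled sequences coincide
    coeffs = [(d_min + rho * d_max) / (d + rho * d_max) for d in diffs]
    return sum(coeffs) / len(coeffs)


# Hypothetical pH readings with one gap and one abnormal spike (not study data).
ph = [7.1, 7.2, None, 7.0, 7.3, 7.1, 12.0, 7.2]
repaired = interpolate_missing(mask_outliers(interpolate_missing(ph)))
```

In this sketch the abnormal values are first masked and then repaired by the same interpolation used for the missing values, which mirrors the order of operations described above. Indices whose grade with respect to the target index exceeds a chosen threshold would be kept as model inputs; the threshold itself is not specified here.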
In the second experiment, the predictive capacity of GRA-LMA and of existing models such as the backpropagation neural network (BP), recurrent neural network (RNN), LSTM, and gated recurrent unit (GRU) was evaluated and compared using different numerical and graphical performance evaluation metrics. Comparative experimental results show that the mean square error of GRA-LMA for pH, dissolved oxygen, chemical oxygen demand, ammonia nitrogen, electrical conductivity, turbidity, total phosphorus, and total nitrogen is reduced to 0.05890, 0.40196, 0.32454, 0.04368, 14.71003, 8.13252, 0.01558, and 0.14345, respectively. The results indicate that GRA-LMA adapts well to predicting various water quality indices and can significantly lower the prediction error.
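For reference, the mean square error reported above can be computed as in the following sketch. The function name and the sample values are hypothetical and are not taken from the study's data.

```python
def mean_squared_error(observed, predicted):
    """Average of the squared differences between observed and predicted values."""
    if not observed or len(observed) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return sum(squared) / len(squared)


# Hypothetical dissolved oxygen readings (mg/L) and model predictions.
observed = [6.8, 7.1, 6.5, 7.4]
predicted = [6.6, 7.3, 6.9, 7.0]
mse = mean_squared_error(observed, predicted)
```

Because the mean square error is expressed in the squared units of each index, the values for different indices (for example, electrical conductivity and total phosphorus) are not directly comparable with one another; each is meaningful only relative to other models evaluated on the same index.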