Lake water level is fundamental to maintaining the structure, function, and integrity of a lake ecosystem. The water level of Lake Poyang varies in a complex way because it is jointly affected by inflow from the five rivers in its basin and by the Yangtze River. To predict these water level changes accurately, a long short-term memory (LSTM) neural network was used to build a water level prediction model for Lake Poyang. The model takes the inflows of the five rivers (Ganjiang, Fuhe, Xinjiang, Raohe, and Xiushui) and the discharge of the Yangtze River mainstream as inputs, and predicts the water level process at representative stations across the lake (Hukou, Xingzi, Duchang, Wucheng, and Kangshan). Hydrological time series from 1956 to 1980 were used as the training set and data from 1981 to 2000 as the validation set. The effects of model parameters, including the input time window, the number of hidden neurons, and the initial learning rate, on prediction accuracy were examined, and the optimal parameters for the model were determined. The results show that the LSTM can reasonably predict the water level process in different parts of Lake Poyang from historical inflow data for the five rivers and the Yangtze River. Across the five stations, the root mean square error (RMSE) is 0.41-0.50 m, and the Nash-Sutcliffe efficiency (NSE) and coefficient of determination (R<sup>2</sup>) are 0.96-0.98.
To investigate how the training data set affects the water level predictions for Lake Poyang, the model was also trained on daily average flow data from 5 random years (1956-1960) and from 5 typical hydrological years (1954, 1973, 1974, 1977, and 1978). The model trained on the 5 random years predicts less accurately than the model trained on the typical hydrological years, especially for flood and low water levels. Because the typical-year data set is still much smaller than the 20-year data set, its overall prediction accuracy is slightly lower than that of the model trained on 20 years of data. When applying a data-driven model of this kind, as much representative data as possible should therefore be selected for training.
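For readers unfamiliar with the three accuracy measures reported above, the following is a minimal sketch of how RMSE, NSE, and R<sup>2</sup> are commonly computed from observed and simulated water levels. It is not the authors' code. The function names and the sample values are hypothetical, and R<sup>2</sup> is assumed here to be the squared Pearson correlation coefficient, which is a common hydrological convention that the abstract itself does not specify.

```python
import numpy as np


def rmse(obs, sim):
    """Root mean square error, in the same units as the inputs (e.g. m)."""
    obs = np.asarray(obs, dtype=float)
    sim = np.asarray(sim, dtype=float)
    return float(np.sqrt(np.mean((sim - obs) ** 2)))


def nse(obs, sim):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 matches the observed mean."""
    obs = np.asarray(obs, dtype=float)
    sim = np.asarray(sim, dtype=float)
    residual = np.sum((obs - sim) ** 2)
    variance = np.sum((obs - obs.mean()) ** 2)
    return float(1.0 - residual / variance)


def r_squared(obs, sim):
    """Coefficient of determination, assumed to be the squared Pearson correlation."""
    obs = np.asarray(obs, dtype=float)
    sim = np.asarray(sim, dtype=float)
    r = np.corrcoef(obs, sim)[0, 1]
    return float(r ** 2)


# Illustrative values only (water levels in m); these are not data from the study.
observed = [10.0, 12.0, 14.0, 16.0]
simulated = [10.5, 11.5, 14.5, 15.5]

print(f"RMSE = {rmse(observed, simulated):.2f} m")
print(f"NSE  = {nse(observed, simulated):.2f}")
print(f"R^2  = {r_squared(observed, simulated):.2f}")
```

RMSE keeps the unit of the water level, which is why it is reported in metres, whereas NSE and R<sup>2</sup> are dimensionless. NSE penalizes systematic bias, while a correlation-based R<sup>2</sup> does not, so reporting both gives a fuller picture of model skill.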