Abstract

Dissolved oxygen (DO) concentration is a key water quality parameter, and changes in bottom-water DO in particular are fundamental to understanding the biogeochemical processes in lake ecosystems. Based on two machine learning (ML) models, Gradient Boost Regressor (GBR) and a long short-term memory (LSTM) network, this study developed three ML approaches: direct GBR, direct LSTM, and a 2-step mixed workflow combining GBR and LSTM. The approaches were used to simulate multi-year surface and bottom DO concentrations in five lakes. All of them were trained with readily available environmental data as predictors, together with indices of lake thermal structure and mixing provided by a one-dimensional (1-D) hydrodynamic model. No single approach performed best in all the tested lakes, but for each lake a best-performing approach was identified that estimated DO concentration with a coefficient of determination (R²) of up to 0.6–0.7. All three approaches had a normalized mean absolute error (NMAE) below 0.15. In a polymictic lake, the 2-step mixed workflow represented bottom DO concentrations better, with a true positive rate (TPR) for hypolimnetic hypoxia detection of over 90%, whereas the other workflows yielded TPRs of around 50%. In most of the tested lakes, the predicted surface DO concentrations and the variables indicating stratified conditions (i.e., the Wedderburn number and the temperature difference between surface and bottom water) were essential for simulating bottom DO. The ML approaches showed promising results and could be used to support short- and long-term water management plans.
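The three evaluation metrics named above (R², NMAE, and the TPR of hypoxia detection) can be illustrated with a minimal sketch. The abstract does not state how NMAE is normalized or which DO threshold defines hypoxia, so the sketch assumes normalization by the observed range and a threshold of 2 mg/L. The function names are hypothetical and are not taken from the study.

```python
import numpy as np

def r2(obs, pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    obs, pred = np.asarray(obs, float), np.asarray(pred, float)
    ss_res = np.sum((obs - pred) ** 2)
    ss_tot = np.sum((obs - obs.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

def nmae(obs, pred):
    """Mean absolute error divided by the observed range.

    Assumption: range normalization; the study may normalize differently.
    """
    obs, pred = np.asarray(obs, float), np.asarray(pred, float)
    return np.mean(np.abs(obs - pred)) / (obs.max() - obs.min())

def hypoxia_tpr(obs, pred, threshold=2.0):
    """Share of observed hypoxic samples that are also predicted hypoxic.

    Assumption: hypoxia means DO below `threshold` (mg/L); 2.0 is illustrative.
    """
    obs, pred = np.asarray(obs, float), np.asarray(pred, float)
    obs_hypoxic = obs < threshold
    pred_hypoxic = pred < threshold
    true_pos = np.sum(obs_hypoxic & pred_hypoxic)
    false_neg = np.sum(obs_hypoxic & ~pred_hypoxic)
    return true_pos / (true_pos + false_neg)
```

Under these assumptions, a TPR of 0.9 would mean that nine of every ten observed hypoxic samples were also flagged as hypoxic by the model. `hypoxia_tpr` is undefined when the observations contain no hypoxic samples.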