Abstract

With the increasing availability of lake monitoring data, data-driven machine learning (ML) models may be able to capture complex algal bloom dynamics that cannot be fully described in process-based (PB) models. We applied two ML models, a Gradient Boosting Regressor (GBR) and a Long Short-Term Memory (LSTM) network, to predict algal blooms and seasonal changes in algal chlorophyll concentrations (Chl) in a mesotrophic lake. Three predictive workflows were tested: the first based solely on available measurements; the second applying a two-step approach that first estimates lake nutrients with limited observations and then predicts Chl from the observed and pre-generated environmental factors; and the third extending the two-step ML approach with hydrodynamic data derived from a PB model as additional training features. The ML models outperformed a PB model in predicting both nutrients and Chl, and the hybrid (third) workflow further improved the prediction of the timing and magnitude of algal blooms. A data sparsity test based on shuffling the order of training and testing years showed that ML model accuracy decreased with increasing sampling interval and that performance varied with the combination of training and testing years.
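The two-step workflow can be sketched as follows. This is a minimal illustrative sketch, not the authors' implementation: the driver variables, synthetic data, and model settings are all assumptions, and scikit-learn's `GradientBoostingRegressor` stands in for the paper's GBR. Step 1 estimates a sparsely observed nutrient from routinely measured drivers; step 2 predicts Chl from the drivers plus the pre-generated nutrient estimate.

```python
# Hypothetical two-step ML workflow: nutrient gap-filling, then Chl prediction.
# All variable names and the synthetic data are illustrative assumptions.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(0)
n = 500
# Routinely measured drivers (e.g. water temperature, inflow, radiation).
drivers = rng.normal(size=(n, 3))
# Synthetic "truths": nutrient depends on drivers; Chl depends on both.
nutrient = drivers @ np.array([0.8, -0.5, 0.3]) + rng.normal(scale=0.1, size=n)
chl = 2.0 * nutrient + drivers[:, 2] + rng.normal(scale=0.1, size=n)

train, test = slice(0, 400), slice(400, n)

# Step 1: estimate the sparsely observed nutrient from drivers alone.
nutrient_model = GradientBoostingRegressor(random_state=0)
nutrient_model.fit(drivers[train], nutrient[train])
nutrient_hat = nutrient_model.predict(drivers)

# Step 2: predict Chl from drivers plus the pre-generated nutrient estimate.
X = np.column_stack([drivers, nutrient_hat])
chl_model = GradientBoostingRegressor(random_state=0)
chl_model.fit(X[train], chl[train])
chl_pred = chl_model.predict(X[test])
```

In the paper's third (hybrid) workflow, PB-model hydrodynamic outputs would simply be appended as extra columns of `X` before the step-2 fit.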
