The growth of anthropogenic activities during the 20th century increased nutrient fluxes into freshwater ecosystems, leading to eutrophication, which often promotes harmful algal blooms (HABs). Recent years have witnessed the regular and massive development of some filamentous algae or cyanobacteria in Lake Geneva. Such blooms can have detrimental economic and human health impacts. In this study, we lay the foundation of an HAB forecast model to help scientists and local stakeholders with the present and future management of this peri-alpine lake. Our forecast strategy pairs two machine learning models with a long-term database built over the past 34 years. We first created HAB groups with a K-means model. We then introduced different lag times in the input of a random forest (RF) model, using a sliding window. Finally, we used a high-frequency dataset to compare the natural mechanisms with the numerical interactions using individual conditional expectation plots. We demonstrate that some HAB events can be forecast on a yearly scale. The information contained in the cyanobacteria concentration data was synthesized into four intensity groups that depend directly on the P. rubescens concentration. This categorical transformation allowed us to obtain forecasts with correlation coefficients that stayed above a threshold of 0.5 up to one year ahead for the cell counts and up to two years ahead for the biovolume data. Moreover, we found that the RF model predicted P. rubescens abundance best at water temperatures around 14°C. This result is consistent with the biological processes of this toxic cyanobacterium. We found that coupling K-means and RF models could help forecast the development of the bloom-forming P. rubescens in Lake Geneva.
This methodology could create a numerical decision support tool, which should be a significant advantage for lake managers.
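
As a rough illustration of the first two steps described above (grouping concentrations into intensity classes, then building lagged inputs with a sliding window), the following is a minimal standard-library sketch. It is not the authors' implementation: the function names `kmeans_1d` and `make_lagged_samples`, the quantile-style initialisation, and the parameters `window` and `lag` are all assumptions made for illustration, and the random forest fit itself is left out.

```python
from statistics import mean


def kmeans_1d(values, k, n_iter=50):
    """Group 1-D values into k clusters with Lloyd's algorithm (illustrative only)."""
    ordered = sorted(values)
    # Assumed deterministic initialisation: spread k centres over the sorted values.
    centres = [ordered[(2 * i + 1) * len(ordered) // (2 * k)] for i in range(k)]
    for _ in range(n_iter):
        clusters = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda j: abs(v - centres[j]))
            clusters[nearest].append(v)
        # Keep the old centre if a cluster happens to be empty.
        new_centres = [mean(c) if c else centres[j] for j, c in enumerate(clusters)]
        if new_centres == centres:
            break
        centres = new_centres
    labels = [min(range(k), key=lambda j: abs(v - centres[j])) for v in values]
    return labels, centres


def make_lagged_samples(series, window, lag):
    """Pair each window of past values with the value `lag` steps after the window ends."""
    samples = []
    last_start = len(series) - window - lag
    for start in range(last_start + 1):
        features = series[start:start + window]
        target = series[start + window + lag - 1]
        samples.append((features, target))
    return samples
```

With four clusters, the labels returned by `kmeans_1d` could play the role of the four intensity groups, and `make_lagged_samples` applied to the label series would yield (past window, future group) pairs that a classifier such as a random forest could be trained on. Increasing `lag` corresponds to forecasting further ahead.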