Abstract

Since the beginning of the industrial revolution in the nineteenth century, water pollution has grown into a global transboundary problem that degrades aquatic ecosystems and poses a direct risk to sustainable economic development and to human, aquatic, and wildlife health. However, a major impediment to monitoring lake water quality is the lack of data for the relevant monitoring parameters at the relevant temporal and spatial scales. This study therefore investigates sixteen major reservoirs on a national scale, using in-situ sampling from 2020 and Sentinel-2 MSI satellite data. Machine learning and deep learning algorithms were employed to classify and predict water quality classes from Sentinel-2 MSI pixel color values and spectral indices. Multivariate techniques such as PCA transformation and regularized MANOVA were used to preprocess the input features and reconstruct variables in order to deal with collinearity, which improved overall classification performance. The results highlight the deep neural network (DNN) and XGBoost as the best-performing classifiers, yielding F1-scores of 0.94 and 0.93, recall of 0.93 and 0.92, and ROC-AUC of 0.92 and 0.89, respectively. Nutrient pollution affects all of the investigated reservoirs, which are classified as either meso- or hypereutrophic, and the situation is deteriorating. This study not only offers a feasible solution for effective water quality observation, but also provides a unique data pipeline for creating training data on biogeochemical parameters for lake water quality studies, using an integrated Google Earth Engine (GEE) Python Application Programming Interface (API) run in a Google Colaboratory notebook.
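For readers unfamiliar with the reported metrics, the following minimal sketch shows how recall and F1-score can be computed for a multi-class water quality classifier. It is an illustration only: the abstract does not state which averaging scheme was used, so macro-averaging is an assumption here, and the function names and example labels are hypothetical rather than taken from the study.

```python
from collections import Counter


def per_class_counts(y_true, y_pred):
    """Count true positives, false positives and false negatives per class."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative
    return tp, fp, fn


def macro_recall_f1(y_true, y_pred):
    """Return (macro recall, macro F1); macro-averaging is an assumption."""
    tp, fp, fn = per_class_counts(y_true, y_pred)
    classes = sorted(set(y_true) | set(y_pred))
    recalls, f1s = [], []
    for c in classes:
        predicted = tp[c] + fp[c]
        actual = tp[c] + fn[c]
        precision = tp[c] / predicted if predicted else 0.0
        recall = tp[c] / actual if actual else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        recalls.append(recall)
        f1s.append(f1)
    n = len(classes)
    return sum(recalls) / n, sum(f1s) / n


# Hypothetical labels for four samples, not data from the study.
y_true = ["meso", "meso", "hyper", "hyper"]
y_pred = ["meso", "hyper", "hyper", "hyper"]
recall, f1 = macro_recall_f1(y_true, y_pred)
print(f"macro recall={recall:.2f}, macro F1={f1:.2f}")
```

Macro-averaging weights every class equally, which matters when trophic classes are unevenly represented; a weighted or micro average would give different numbers on imbalanced data.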
