Abstract

This study developed machine-learning models to predict the concentration of chlorophyll-a ([Chl-a]), used as a proxy for algal population, from data collected at multiple monitoring stations in the Han River basin, and then analyzed the relationship between [Chl-a] and the input variables of the optimized model. Daily water quality and meteorological data from 2012 to 2020 were collected from the real-time water quality information system and the meteorological administration of Korea. Model accuracy was quantified with the coefficient of determination, root mean square error, and mean absolute error. Among random forest (RF), support vector machine, and artificial neural network models, the RF with the random dataset showed the highest accuracy. The RF performed best with 78 trees. The input variables of the best RF model were total organic carbon (feature importance: 27%), total nitrogen (19%), pH (13%), water temperature (8%), total phosphorus (8%), electrical conductivity (7%), dissolved oxygen (6%), minimum air temperature (AT) (4%), mean AT (3%), and maximum AT (3%). The feature-importance analysis thus identified total organic carbon as the most important variable for predicting [Chl-a] in the Han River basin, and ranked total nitrogen above total phosphorus.
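The three accuracy measures named above can be sketched in a few lines of plain Python. This is only an illustrative sketch: the function names and the sample [Chl-a] values are hypothetical and are not taken from the study. The coefficient of determination is computed here as 1 - SS_res/SS_tot, which is an assumption, since the abstract does not say which definition the authors used.

```python
import math


def rmse(observed, predicted):
    """Root mean square error between observed and predicted values."""
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def mae(observed, predicted):
    """Mean absolute error between observed and predicted values."""
    absolute = [abs(o - p) for o, p in zip(observed, predicted)]
    return sum(absolute) / len(absolute)


def r_squared(observed, predicted):
    """Coefficient of determination, assumed here to be 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Hypothetical [Chl-a] values, for illustration only (not study data).
observed = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 33.0, 39.0]

print(f"R2   = {r_squared(observed, predicted):.3f}")  # R2   = 0.964
print(f"RMSE = {rmse(observed, predicted):.3f}")       # RMSE = 2.121
print(f"MAE  = {mae(observed, predicted):.3f}")        # MAE  = 2.000
```

RMSE penalizes large errors more heavily than MAE, so reporting both alongside the coefficient of determination gives a fuller picture of how a model's predictions deviate from observations.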
