Water quality degradation poses a significant challenge globally, especially in developing nations such as Sri Lanka. Extensive monitoring programs designed to address escalating river pollution collect multiple water quality parameters over extended periods and at many locations. However, the sheer volume of data can be overwhelming, making it difficult to process effectively and interpret accurately using conventional methods. In this study, latent variable (LV) and unsupervised machine learning techniques were used to investigate spatial and seasonal variations of surface water quality for 17 parameters across 17 locations along the Kelani River, Sri Lanka, using monthly measurements from 2016 to 2020. Pearson's correlation matrix identified 10 parameters significantly affecting water quality variations, and factor analysis (FA) generated five LVs, accounting for 77% of the total variance in the dataset. The identified LVs pointed to multiple sources of river pollution. Hierarchical clustering analysis and self-organizing mapping grouped the stations into closely similar clusters. Stations near industrial zones and the river mouth showed higher water quality variance, often exceeding national guidelines. Correlation testing revealed strong relationships between water quality and catchment hydrometeorological variations during monsoonal seasons. Spatial analyses showed increased LV variance in the Lower Kelani River Basin, indicating higher pollutant levels in different seasons. Industrial effluents (LV-2 and LV-4) and domestic and municipal sewage (LV-3 and LV-5) exhibited greater seasonal fluctuations. The results showed that the proposed LV approach has the potential to assist authorities in addressing water pollution amidst the complexity of multiple water quality parameters.
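
As a rough illustration of two of the steps summarised above (correlation-based parameter screening and hierarchical clustering of stations), a minimal Python sketch follows. It is not the authors' implementation. The screening rule (keep a parameter if it correlates with at least one other at |r| >= 0.5), the average-linkage criterion, the parameter and station names, and all numeric values are assumptions made for illustration, and the values are synthetic. Factor analysis and self-organizing maps are omitted.

```python
import math
from statistics import mean, pstdev


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def screen_parameters(columns, threshold=0.5):
    """Keep parameters with |r| >= threshold against at least one other one.

    The threshold is a hypothetical choice, not the study's criterion.
    """
    names = list(columns)
    return [
        a for a in names
        if any(abs(pearson(columns[a], columns[b])) >= threshold
               for b in names if b != a)
    ]


def zscore(values):
    """Standardise values to zero mean and unit (population) std deviation."""
    m, s = mean(values), pstdev(values)
    return [(v - m) / s for v in values]


def average_linkage(points, n_clusters):
    """Agglomerative clustering: Euclidean distance, average linkage.

    points maps a label to a feature vector; returns a list of label sets.
    """
    clusters = [{label} for label in points]

    def dist(c1, c2):
        return mean(math.dist(points[a], points[b]) for a in c1 for b in c2)

    while len(clusters) > n_clusters:
        pairs = ((i, j) for i in range(len(clusters))
                 for j in range(i + 1, len(clusters)))
        i, j = min(pairs, key=lambda ij: dist(clusters[ij[0]], clusters[ij[1]]))
        clusters[i] |= clusters[j]  # merge the closest pair
        del clusters[j]
    return clusters


# Synthetic station-level means, for illustration only (not study data).
stations = ["S1", "S2", "S3", "S4", "S5", "S6"]
station_means = {
    "BOD": [1.0, 1.2, 1.1, 4.0, 4.3, 4.1],
    "COD": [10.0, 11.0, 10.5, 30.0, 32.0, 31.0],
    "pH": [7.0, 7.3, 6.9, 7.1, 7.2, 6.8],
}

kept = screen_parameters(station_means, threshold=0.5)
scaled = [zscore(station_means[name]) for name in kept]
features = {s: [col[i] for col in scaled] for i, s in enumerate(stations)}
clusters = average_linkage(features, n_clusters=2)
print(kept, clusters)
```

With real monitoring data, each list would hold one value per station (or per station and month), and the number of clusters would be chosen from a dendrogram rather than fixed in advance.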