Application of multivariate statistical techniques for monitoring Melaka River water quality (2019–2023)
ABSTRACT This study examines spatial and temporal variations in water quality in the Melaka River Basin from 2019 to 2023 using three multivariate statistical techniques: hierarchical cluster analysis (HCA), discriminant analysis (DA), and principal component analysis (PCA). HCA grouped the monitoring sites into high-, medium-, and low-pollution clusters and revealed seasonally driven pollutant dispersion patterns that intensified during monsoon periods. DA validated these classes with strong discriminating ability (R2 = 0.694) and identified coliform, NH3N, COD, and turbidity as critical indicators. PCA further identified the main pollution drivers as sediment load and microbial input under wet conditions, and organic enrichment and salinity stress during drier periods. Notably, forward stepwise DA showed that a reduced set of only four variables (salinity, temperature, COD, and NH3N) retained high predictive power, which supports simplified, cost-effective water monitoring strategies. Overall, the findings show that multivariate techniques are powerful tools for breaking down complex environmental data sets, revealing seasonal and spatial pollution dynamics, and identifying specific remediation activities in a river system of historical and ecological importance, in support of evidence-based water governance and targeted policy intervention.
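The site-grouping step described in the abstract can be illustrated with a minimal sketch of agglomerative clustering. The abstract does not state the linkage rule or distance measure used, so average linkage on z-scored Euclidean distances is an assumption here, and the site names and values are hypothetical, chosen only to show the mechanics.

```python
import math

def zscore_columns(rows):
    """Standardize each column to mean 0 and population standard deviation 1."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    # Fall back to 1.0 for a constant column to avoid division by zero.
    sds = [math.sqrt(sum((v - m) ** 2 for v in c) / len(c)) or 1.0
           for c, m in zip(cols, means)]
    return [[(v - m) / s for v, m, s in zip(row, means, sds)] for row in rows]

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def average_linkage_clusters(points, k):
    """Repeatedly merge the two closest clusters until k clusters remain."""
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > k:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Average pairwise distance between members of the two clusters.
                d = sum(euclidean(points[i], points[j])
                        for i in clusters[a] for j in clusters[b])
                d /= len(clusters[a]) * len(clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return [sorted(c) for c in clusters]

# Hypothetical site means: [COD mg/L, NH3-N mg/L, turbidity NTU]
sites = ["S1", "S2", "S3", "S4", "S5", "S6"]
data = [
    [10.0, 0.2, 15.0],
    [12.0, 0.3, 18.0],
    [25.0, 1.0, 40.0],
    [27.0, 1.1, 45.0],
    [60.0, 3.0, 120.0],
    [65.0, 3.2, 110.0],
]
groups = average_linkage_clusters(zscore_columns(data), k=3)
for g in groups:
    print([sites[i] for i in g])
```

Standardizing first keeps a parameter with large numeric values, such as turbidity, from dominating the distance calculation.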
- Research Article
1608
- 10.1016/j.watres.2004.06.011
- Sep 8, 2004
- Water Research
Multivariate statistical techniques for the evaluation of spatial and temporal variations in water quality of Gomti River (India)—a case study
- Research Article
87
- 10.4236/jep.2013.45055
- Jan 1, 2013
- Journal of Environmental Protection
Variations in the water quality of River Ogun around the cattle market at Isheri, along the Lagos-Ibadan express road, were evaluated using multivariate statistical techniques, namely principal component analysis (PCA) and cluster analysis (CA), to analyze the similarities and dissimilarities among the sampling points and so identify spatial and temporal variations in water quality and sources of contamination over time. Water quality data were generated from 8 sampling points in six sampling years (2000, 2005, 2006, 2009, 2010, and 2011). The samples were analyzed for 14 physico-chemical parameters (temperature, pH, total solids (TS), total dissolved solids (TDS), suspended solids (SS), oil and grease, dissolved oxygen (DO), chemical oxygen demand (COD), Cl−, alkalinity, total hardness (TH), SO42−, NO3−, and PO43−) and heavy metals (Cd, Cu, Fe, Ni, Pb, Mn, and Zn). Three zones with similar water quality features were differentiated based on the cluster analysis results. Thus, the water quality around the site may be categorized as relatively less polluted, moderately polluted, and highly polluted. PCA helped to extract and identify the factors responsible for water quality variations over the years. The results showed that the parameters driving changes in water quality differ over time. Natural, inorganic, and organic parameters (e.g., temperature and TS) contributed most to the variations in water quality over the years. This shows that a parameter that contributes significantly to water quality in one season may be less significant, or not significant at all, in another. This result may be used to reduce the number of samples analyzed in both space and time without much loss of information. It will assist decision makers in identifying priorities to improve water quality that has deteriorated due to pollution from various anthropogenic activities.
- Research Article
36
- 10.3390/ijerph18168268
- Aug 4, 2021
- International Journal of Environmental Research and Public Health
This study assessed spatial and temporal variations of water quality to identify and quantify possible pollution sources affecting the Yeongsan River using multivariate statistical techniques (MSTs) and water quality index (WQI) values. A 15 year dataset of 11 water quality variables was used, covering 16 monitoring sites. The nutrient regime, organic matter, suspended solids, ionic contents, algal growth, and total coliform bacteria (TCB) were affected by the summer monsoon and the construction of weirs. Regression analysis showed that the algal growth was more highly regulated by total phosphorus (TP; R2 = 0.37) than total nitrogen (TN, R2 = 0.25) and TN/TP (R2 = 0.01) ratios in the river after weir construction and indicated that the river is a P-limited system. After constructing the weirs, the mean TN/TP ratio in the river was about 40, meaning it is a P-limited system. Cluster analysis was used to classify the sampling sites into highly, moderately, and less polluted sites based on water quality features. Stepwise discriminant analysis showed that pH, dissolved oxygen (DO), TN, biological oxygen demand (BOD), chemical oxygen demand (COD), chlorophyll-a (CHL-a), and TCB are the spatially discriminating parameters, while pH, water temperature, DO, electrical conductivity, total suspended solids, and COD are the most significant for discriminating among the three seasons. The Pearson network analysis showed that nutrients flow with organic matter in the river, while CHL-a showed the highest correlation with COD (r = 0.85), followed by TP (r = 0.49) and TN (r = 0.49). Average WQI values ranged from 55 to 141, indicating poor to unsuitable water quality in the river. The Mann–Kendall test showed increasing trends in COD and CHL-a but decreasing trends for TP, TN, and BOD due to impoundment effects. 
The principal component analysis combined with factor analysis and positive matrix factorization (PMF) showed that two sewage treatment plants, agricultural activities, and livestock farming adversely impacted river water quality. The PMF model returned greater R2 values for BOD (0.92), COD (0.87), TP (0.93), TN (0.91), CHL-a (0.93), and TCB (0.83), indicating reliable apportionment results. Our results suggest that MSTs and WQI can be effectively used for the simple interpretation of large-scale datasets to determine pollution sources and their spatiotemporal variations. The outcomes of our study may aid policymakers in managing the Yeongsan River.
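For readers unfamiliar with the Mann–Kendall test mentioned in this abstract, the following is a minimal sketch of the statistic S and its normal approximation Z, with the standard tie correction to the variance. The series is hypothetical (not the Yeongsan River data), and the function name is illustrative.

```python
import math

def mann_kendall(series):
    """Return (S, Z) for the Mann-Kendall trend test."""
    n = len(series)
    s = 0
    # S counts later-minus-earlier sign agreements over all pairs.
    for i in range(n - 1):
        for j in range(i + 1, n):
            diff = series[j] - series[i]
            s += (diff > 0) - (diff < 0)
    # Variance of S, reduced for groups of tied values.
    counts = {}
    for v in series:
        counts[v] = counts.get(v, 0) + 1
    tie_term = sum(t * (t - 1) * (2 * t + 5) for t in counts.values() if t > 1)
    var_s = (n * (n - 1) * (2 * n + 5) - tie_term) / 18.0
    # Continuity-corrected standard normal score.
    if s > 0:
        z = (s - 1) / math.sqrt(var_s)
    elif s < 0:
        z = (s + 1) / math.sqrt(var_s)
    else:
        z = 0.0
    return s, z

# Hypothetical annual mean COD values (mg/L), for illustration only.
cod = [4.1, 4.3, 4.2, 4.8, 5.0, 5.1, 5.6, 5.9]
s, z = mann_kendall(cod)
print(s, round(z, 3))
```

A positive Z above about 1.96 suggests an increasing trend at the 5% significance level (two-sided); a negative Z below about -1.96 suggests a decreasing one.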
- Research Article
153
- 10.1016/j.envres.2007.10.008
- Dec 11, 2007
- Environmental Research
Land use effects in groundwater composition of an alluvial aquifer (Trussu River, Brazil) by multivariate techniques
- Research Article
29
- 10.2166/nh.2014.174
- Jan 28, 2014
- Hydrology Research
Water quality monitoring programs generate complex multidimensional data sets. In this study, multivariate statistical techniques were employed as an effective tool for the analysis and interpretation of these water quality data sets. Principal component analysis (PCA) and cluster analysis (CA) were applied to evaluate spatial and temporal variation of water quality in Río Tercero Reservoir (Argentina). Six sampling sites were surveyed each climatic season for 21 parameters during 2003–2010. PCA revealed four significant principal components (PCs), which together account for 96.7% of the total variance of the data set. The first PC was assigned to mineralization, whereas the other PCs were built from variables indicative of pollution. Hierarchical CA grouped the six monitoring sites into three clusters and classified the different climatic seasons into two clusters based on similarities in water quality characteristics.
- Research Article
1
- 10.3233/ajw-160030
- Jul 18, 2016
- Asian Journal of Water, Environment and Pollution
Different multivariate statistical techniques were applied to interpret the temporal variations in water quality of Tung Dhab drain, Amritsar, India, and to identify water pollution sources. Data were collected seasonally for a period of two years (2012-2013) using 34 water quality parameters. The recorded values for variables such as turbidity, total suspended solids, biochemical oxygen demand, chemical oxygen demand, oil & grease, nitrate as N, lead, chromium, nickel and zinc were much higher than the recommended permissible limits for discharge into inland waters. Significant correlations were found between different physicochemical parameters (p ≤ 0.05; p ≤ 0.01). Cluster Analysis (CA) grouped the four sampling seasons into two clusters. CA confirmed that the water quality of the rainy season differed from that of the other three seasons in terms of similarity and distance indices. Principal Component Analysis/Factor Analysis (PCA/FA) identified mineral, organic, agricultural and industrial pollutants as responsible for the deterioration of drain water quality. The present study will help environmental agencies to make and enforce decisions regarding improvement of the water quality of Tung Dhab drain.
- Research Article
14
- 10.2166/wst.2015.592
- Nov 26, 2015
- Water Science and Technology
Urban wastewater treatment plant (WWTP) effluent used as reclaimed water provides an alternative water resource for urban rivers, and this effluent can significantly influence river water quality. The objective of this study was to investigate the spatial and temporal variations of water quality in XZ River, an artificial urban river in Shenzhen city, Guangdong Province, China, after receiving reclaimed water from WWTP effluent. Water samples were collected monthly at different sites along XZ River from April 2013 to September 2014. Multivariate statistical techniques and a box-plot were used to assess the variations of water quality and to identify the main pollution factor. The results showed that the input of WWTP effluent could effectively increase dissolved oxygen and decrease the turbidity, phosphorus load and organic pollution load of XZ River. However, total nitrogen and nitrate pollution loads were found to remain at higher levels after receiving reclaimed water, which might aggravate the eutrophication status of XZ River. Organic pollution load was lowest 750 m downstream, while turbidity and nutrient load were lowest 2,300 m downstream. Nitrogen and phosphorus pollution loads were higher in the dry season and at the beginning of the wet season.
- Research Article
12
- 10.3390/w14121829
- Jun 7, 2022
- Water
Humanity’s water needs are constantly increasing, yet human activity is steadily degrading water reserves in both quantity and quality. It is therefore necessary to preserve these reserves. Any strategy for conserving the quality of a hydrosystem depends on determining the chemical characteristics of its waters. The objective of this study is therefore to investigate the spatial and temporal variations of water quality in the Tiflet River, a watercourse in northwestern Morocco, to estimate its degree of pollution and to determine its main sources of pollution. Eight stations were established along the watercourse, positioned to take into account the potential sources of pollution, and eleven physicochemical parameters were evaluated seasonally. Multivariate statistical techniques were used to assess variations in water quality and identify the main factors responsible for pollution. The results showed that wastewater discharges into the river can increase its salinity, phosphorus load and organic pollution load. Total nitrogen and nitrate pollution loads exceeded the standard norms at stations exposed to agricultural pollution and to watershed leaching, which could aggravate the eutrophication state of the river and stimulate the growth of aquatic vegetation. The organic pollution load recorded in the wet season is lower than that recorded in the dry season, whereas the nutrient load recorded during the dry season is lower than that recorded in the wet season. An overall pollution index was used, classifying surface waters from sub-clean to moderately polluted.
- Research Article
6
- 10.3390/w16010166
- Dec 31, 2023
- Water
Analyzing 165 data from five national control sites in Baiyangdian Lake, this study unveils its spatiotemporal pattern of water quality. Utilizing machine learning and multivariate statistical techniques, this study elucidates the effects of rainfall and human activities on the lake’s water quality. The results show that the main pollutants in Baiyangdian Lake are TN, TP, and IMN. Spatially, human activities are the main drivers of water quality, with the poorest quality observed in the surrounding village area. The temporal dynamics of water quality parameters exhibit three distinct patterns: Firstly, parameters predominantly influenced by point source pollution, like TN and NH4+-N, show lower concentrations during flood periods. Secondly, parameters affected by non-point source pollution, such as TP, show higher concentrations during flood periods. Thirdly, irregular variations were observed in pH, DO, and IMN. The evaluation of Baiyangdian Lake’s water quality based on the grey relationship analysis method indicates that its water quality is good, falling within Classes I and II. Time series analysis found that the dilution effect of rainfall and the scouring action of runoff dominate the temporal variation in water quality in Baiyangdian Lake. The major pollution sources were identified as domestic sewage, followed by agricultural non-point source pollution and the release of internal pollutants. Additionally, aquaculture emerged as a significant contributor to the Lake’s pollution. This research provides a scientific basis for controlling the continuous deterioration of Baiyangdian Lake’s water quality and restoring its ecological function.
- Research Article
111
- 10.1007/bf03326244
- Jun 1, 2011
- International Journal of Environmental Science & Technology
Water pollution has become a growing threat to human society and natural ecosystems in recent decades. Assessment of seasonal changes in water quality is important for evaluating temporal variations of river pollution. In this study, seasonal variations in the chemical characteristics of surface water in the Chehelchay watershed in northeastern Iran were investigated. Various multivariate statistical techniques, including multivariate analysis of variance, discriminant analysis, principal component analysis and factor analysis, were applied to analyze a river water quality data set containing 12 parameters recorded over 13 years (1995–2008). The results showed that river water quality has significant seasonal changes. Discriminant analysis identified the most important parameters contributing to seasonal variations of river water quality. The analysis achieved a dramatic data reduction, using only five parameters (electrical conductivity, chloride, bicarbonate, sulfate and hardness) to correctly assign 70.2% of the observations to their respective seasonal groups. Principal component analysis/factor analysis helped to identify the factors or origins responsible for seasonal water quality variations. In each season, more than 80% of the total variance is explained by three latent factors standing for salinity, weathering-related processes and alkalinity, respectively. Generally, the analysis of water quality data revealed that the Chehelchay River water chemistry is strongly affected by rock-water interaction, hydrologic processes and anthropogenic activities. This study demonstrates the usefulness of multivariate statistical approaches for analysis and interpretation of water quality data, identification of pollution sources and understanding of temporal variations in water quality for effective river water quality management.
- Research Article
125
- 10.1029/2018wr023370
- Jan 1, 2019
- Water Resources Research
Understanding the factors that influence temporal variability in water quality is critical for designing water quality management strategies. In this study, we explore the key factors that affect temporal variability in stream water quality across multiple catchments using a Bayesian hierarchical model. We apply this model to a case study data set consisting of monthly water quality measurements obtained over a 20‐year period from 102 water quality monitoring sites in the state of Victoria (Southeast Australia). We investigate six water quality constituents: total suspended solids, total phosphorus, filterable reactive phosphorus, total Kjeldahl nitrogen, nitrate‐nitrite (NOx), and electrical conductivity. We find that same‐day streamflow has the greatest effect on water quality variability for all constituents. Additional important predictors include soil moisture, antecedent streamflow, vegetation cover, and water temperature. Overall, the models do not explain a large proportion of temporal variation in water quality, with Nash‐Sutcliffe coefficients lower than 0.49. However, when considering performance on a site‐by‐site basis, we see high model performance in some locations, with Nash‐Sutcliffe coefficients of up to 0.8 for NOx and electrical conductivity. The effect of the temporal predictors on water quality varies between sites, which should be explored further for potential spatial patterns in future studies. There is also potential for further extension of these temporal variability models into a predictive spatiotemporal model of riverine constituent concentrations, which will be a useful tool to inform decision making for catchment water quality management.
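The Nash‐Sutcliffe coefficient quoted in this abstract compares model error against the variance of the observations: a value of 1 means a perfect fit, and 0 means the model does no better than predicting the observed mean. A minimal sketch follows, with hypothetical values and an illustrative function name.

```python
def nash_sutcliffe(observed, simulated):
    """Nash-Sutcliffe efficiency: 1 - sum((obs - sim)^2) / sum((obs - mean_obs)^2)."""
    mean_obs = sum(observed) / len(observed)
    residual = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    spread = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - residual / spread

# Hypothetical observed and modelled concentrations, for illustration only.
obs = [2.0, 4.0, 6.0, 8.0]
sim = [2.5, 3.5, 6.5, 7.0]
nse = nash_sutcliffe(obs, sim)
print(round(nse, 4))
```

The function assumes the observations are not all identical; otherwise the denominator is zero and the coefficient is undefined.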
- Research Article
90
- 10.1016/j.jclepro.2018.04.121
- Apr 16, 2018
- Journal of Cleaner Production
A comparative assessment of Australia's Lower Lakes water quality under extreme drought and post-drought conditions using multivariate statistical techniques
- Research Article
93
- 10.2166/hydro.2008.008
- Jan 1, 2008
- Journal of Hydroinformatics
Multivariate statistical techniques, such as principal component analysis (PCA), factor analysis (FA) and discriminant analysis (DA), were applied for the evaluation of temporal/spatial variations and the interpretation of a large complex water quality dataset of the Mekong River using data sets generated during 6 years (1995–2000) of monitoring of 18 parameters (16,848 observations) at 13 different sites. The results of PCA/FA revealed that most of the variations are explained by dissolved mineral salts along the whole Mekong River and in individual stations. Discriminant analysis showed the best results for data reduction and pattern recognition during both spatial and temporal analysis. Spatial DA revealed 8 parameters (total suspended solids, calcium, sodium, alkalinity, chloride, iron, nitrate nitrogen, total phosphorus) and 12 parameters (total suspended solids, calcium, sodium, potassium, alkalinity, chloride, sulfate, iron, nitrate nitrogen, total phosphorus, silicon, dissolved oxygen) are responsible for significant variations between monitoring regions and countries, respectively. Temporal DA revealed 3 parameters (conductivity, alkalinity, nitrate nitrogen) between monitoring regions; 3 parameters (total suspended solids, conductivity, silicon) in midstream region; and 2 parameters (conductivity, silicon) in upstream, lower stream and delta region which are the most significant parameters to discriminate between the four different seasons (spring, summer, autumn, winter). Thus, this study illustrates the usefulness of principal component analysis, factor analysis and discriminant analysis for the analysis and interpretation of complex datasets and in water quality assessment, identification of pollution sources/factors, and understanding of temporal and spatial variations of water quality for effective river water quality management.
- Research Article
123
- 10.1016/j.jenvman.2006.12.007
- Jun 22, 2007
- Journal of Environmental Management
Cluster analysis and quality assessment of logged water at an irrigation project, eastern Saudi Arabia
- Research Article
11
- 10.3390/w14142200
- Jul 12, 2022
- Water
In this study, spatiotemporal fluctuations in surface water quality in Vinh Long province, Vietnam, were assessed using entropy weighting, water quality index (WQI), and multivariate statistical techniques, such as cluster analysis (CA), principal component analysis (PCA), and discriminant analysis (DA). The samples collected at 63 monitoring locations in March, June, and September were measured for 15 parameters. Compared to the Vietnamese standard, surface water was contaminated with organic matter, nutrients, microorganisms, and salinity. DA identified the most typical parameters (pH, turbidity, TSS, EC, DO, Cl−, E. coli, coliform) for distinguishing temporal variations in water quality, with more than 75% correct classification. CA grouped the 63 sampling sites into 22 clusters representing different land use patterns. WQI showed that the worst water quality occurred in the agricultural areas. Based on the results of entropy weighting, EC, coliform, N-NH4+, BOD, N-NO3−, and Fe significantly controlled surface water quality. Four principal components obtained from PCA explained 66.45% of the variance, suggesting the influences of geohydrological factors and anthropogenic activities, such as domestic, market area, agriculture, and industry. The findings of this study can provide useful information for authorities to evaluate the effectiveness of monitoring systems and plan for water quality management strategies.
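Entropy weighting, as named in this abstract, assigns larger weights to parameters whose values vary more across samples. The abstract does not give the normalization the authors used, so the sketch below follows one common textbook form (column-sum proportions and Shannon entropy) as an assumption; the data and function name are hypothetical.

```python
import math

def entropy_weights(matrix):
    """Entropy weights for columns of a positive-valued matrix (rows = samples)."""
    n = len(matrix)
    divergences = []
    for col in zip(*matrix):
        total = sum(col)
        # Share of each sample in the column total.
        shares = [v / total for v in col]
        # Shannon entropy scaled to [0, 1] by ln(n).
        entropy = -sum(p * math.log(p) for p in shares if p > 0) / math.log(n)
        # Low entropy (uneven values) means high information content.
        divergences.append(1.0 - entropy)
    total_div = sum(divergences)  # zero only if every column is constant
    return [d / total_div for d in divergences]

# Hypothetical samples: column 0 is constant, column 1 varies strongly.
samples = [
    [1.0, 5.0],
    [1.0, 10.0],
    [1.0, 85.0],
]
weights = entropy_weights(samples)
print([round(w, 3) for w in weights])
```

In this toy case the constant column carries no discriminating information and receives a weight of approximately zero, while the variable column receives nearly all of the weight.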