Inconsistency of PCA-based water quality index – Does it reflect the quality?

Biswanath Mahanty, Pema Lhamo, Naresh K Sahoo

doi:10.1016/j.scitotenv.2022.161353

Abstract

The formalization of a stable water quality index (WQI) from measured hydrogeochemical parameters is essential for the identification and classification of water resources. In the principal component analysis (PCA) based WQI approach, parameter weights are derived from either PC loadings or rotated factor loadings computed over a large number of samples pooled for WQI measurement. The PCA-based approach is paradoxical, as the calculated WQI rating of a sample then depends on the size and composition of the sample population. Though this issue is well anticipated, no attempt has been made to regularize or measure the extent of WQI disagreement. In the present study, the WQI of 106 groundwater samples analyzed for 12 different hydrochemical parameters was modelled using the PC loading or rotated factor loading approach (referred to as PCQ-1 and PCQ-2, respectively). The analysis reveals PCQ-1 to be positively biased in 78 % of samples, with rating disagreements evident in 9.43 % of samples. The WQI of the data set was then estimated using repeated (1000) random non-overlapping 2- to 5-fold data partitioning (containing 21 to 83 samples in each fold), adopting either an in-sample (test set) or out-sample (train set) modelling approach. The mean WQI deviation from the reference (i.e., the WQI computed using the entire dataset) across repeated resampling was positive in most samples under the PCQ-1 model, irrespective of the fold partition size. The median root mean square deviation of the data set increased with the number of folds for in-sample calibration in both the PCQ-1 and PCQ-2 approaches. Excluding a single water quality parameter from the PCA model can cause up to a 60 % deviation of the WQI score in some water samples. The cross-validation and Monte Carlo resampling approach can serve as a framework to test the stability of PCA-based WQI.
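The resampling idea described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not state the PCQ-1 weighting formula, so the function `pca_weights` (variance-weighted absolute PC loadings, normalized to sum to 1), the sub-index form `100 * X / limits`, the `var_threshold` cut-off, and the synthetic data are all assumptions made for illustration. "In-sample" is interpreted here as deriving the weights from the same fold whose samples are scored, which is also an assumption.

```python
import numpy as np

def pca_weights(X, var_threshold=0.8):
    """Hypothetical PC-loading weights (an assumed scheme, not the paper's formula)."""
    # PCA on the correlation matrix is equivalent to PCA on standardized data.
    R = np.corrcoef(X, rowvar=False)
    eigval, eigvec = np.linalg.eigh(R)
    order = np.argsort(eigval)[::-1]              # sort components by variance
    eigval = np.clip(eigval[order], 0.0, None)    # guard against tiny negatives
    eigvec = eigvec[:, order]
    explained = eigval / eigval.sum()
    # Keep the smallest number of PCs reaching the assumed variance threshold.
    k = min(int(np.searchsorted(np.cumsum(explained), var_threshold)) + 1, len(eigval))
    loadings = eigvec[:, :k] * np.sqrt(eigval[:k])
    raw = (np.abs(loadings) * explained[:k]).sum(axis=1)
    return raw / raw.sum()

def wqi(X, limits, weights):
    """Weighted sum of sub-indices; the sub-index 100 * C / S is an assumption."""
    return (100.0 * X / limits) @ weights

def resample_deviation(X, limits, n_folds=3, n_repeats=200, seed=0):
    """Repeated random non-overlapping k-fold partitioning, in-sample weights."""
    rng = np.random.default_rng(seed)
    ref = wqi(X, limits, pca_weights(X))          # reference: entire data set
    n = X.shape[0]
    dev = np.zeros((n_repeats, n))
    for r in range(n_repeats):
        for idx in np.array_split(rng.permutation(n), n_folds):
            w = pca_weights(X[idx])               # weights from the fold itself
            dev[r, idx] = wqi(X[idx], limits, w) - ref[idx]
    rmsd = np.sqrt((dev ** 2).mean(axis=1))       # one RMSD per repeat
    return dev.mean(axis=0), float(np.median(rmsd))

# Synthetic stand-in for 106 samples x 12 parameters (not the study's data).
demo_rng = np.random.default_rng(42)
X_demo = demo_rng.lognormal(mean=0.0, sigma=0.5, size=(106, 12))
limits_demo = np.full(12, 2.0)                    # hypothetical permissible limits

for folds in (2, 3, 4, 5):
    mean_dev, median_rmsd = resample_deviation(X_demo, limits_demo, n_folds=folds)
    share_positive = (mean_dev > 0).mean()
    print(f"{folds}-fold: median RMSD = {median_rmsd:.3f}, "
          f"samples with positive mean deviation = {share_positive:.0%}")
```

Because the weights are recomputed from each fold, the same sample can receive a different score depending on which other samples share its fold; the per-sample mean deviation and the median RMSD across repeats give one way to quantify that dependence. The numbers produced on synthetic data say nothing about the values reported in the study.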

Full Text