ABSTRACT Drinking water purity analysis is an essential task that requires several real-world parameters to ensure water quality. So far, sensor-based analysis of water quality in specific environments has relied on parameters such as pH level, hardness, and total dissolved solids (TDS). Such methods determine only whether an environment provides potable water, that is, purified water free from contamination. This analysis gives an absolute (yes-or-no) answer, whereas the demand for drinking water is a growing problem in which multi-level estimation is essential to use the available water resources efficiently. In this article, we used a benchmark water quality assessment dataset for analysis. To perform a level assessment, we computed three major features, namely correlation-entropy, dynamic scaling, and estimation levels, and appended them to the original feature vector. The data were assessed using a statistical machine learning model that ensembles a random forest model and a light gradient boosting machine (LightGBM). The probability of the ensemble model was computed using the Kullback-Leibler divergence. The proposed probabilistic model achieved an accuracy of 96.8%, a sensitivity of 94.55%, and a specificity of 98.29%.
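As a rough illustration of two quantities named in the abstract, the sketch below computes the Kullback-Leibler divergence between two discrete probability distributions, and derives accuracy, sensitivity, and specificity from confusion-matrix counts. The abstract does not say how the divergence enters the ensemble, so the code only shows the standard definitions. The function names (`kl_divergence`, `binary_metrics`), the probability vectors, and the confusion counts are all hypothetical and are not taken from the article.

```python
import math


def kl_divergence(p, q, eps=1e-12):
    """Kullback-Leibler divergence D(p || q), in nats, for discrete distributions."""
    if len(p) != len(q):
        raise ValueError("distributions must have the same length")
    total = 0.0
    for pi, qi in zip(p, q):
        if pi > 0.0:  # terms with p_i = 0 contribute nothing to the sum
            # eps guards against division by zero when q_i is 0 (an assumption)
            total += pi * math.log(pi / max(qi, eps))
    return total


def binary_metrics(tp, fp, tn, fn):
    """Accuracy, sensitivity, and specificity from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    sensitivity = tp / (tp + fn)  # true positive rate
    specificity = tn / (tn + fp)  # true negative rate
    return accuracy, sensitivity, specificity


# Hypothetical class probabilities from two models for a single sample
p_rf = [0.7, 0.3]
p_gbm = [0.6, 0.4]
divergence = kl_divergence(p_rf, p_gbm)

# Hypothetical confusion counts, chosen only for illustration
acc, sens, spec = binary_metrics(tp=40, fp=5, tn=45, fn=10)
```

The divergence is zero when the two distributions agree and grows as they differ, which is why it can serve as a measure of disagreement between two models' predicted probabilities.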