Abstract

<p>Water is essential for living organisms to function and survive, but it is contaminated by sources such as industrial waste, oil spills, and marine dumping. With a growing population, the availability of good-quality water is of critical importance. This motivates an analysis of water quality using statistical and ensemble methods, with the aim of finding the best-performing model of each kind. Earlier research has predicted water quality using standalone statistical and ensemble models, so this study focuses on identifying, separately, the best statistical model and the best ensemble model among those tried. The statistical models compared are Principal Component Analysis (PCA), Hierarchical Clustering Analysis (HCA), Linear Discriminant Analysis (LDA), and Quadratic Discriminant Analysis (QDA). The ensemble models are Bagging, Boosting, and Stacking. The models are then combined into a hybrid model so that the three approaches (statistical, ensemble, and hybrid) can be compared. Performance is evaluated using the confusion matrix, accuracy, precision, recall, F1-score, and ROC curve. In this comparison, the hybrid model produces the most accurate results, indicating that combining statistical and ensemble models is effective.</p>
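
As a minimal illustration of how the listed metrics relate to one another, the sketch below derives accuracy, precision, recall, and F1-score from the four cells of a binary confusion matrix. The function names and the toy labels are hypothetical and are not taken from the study; the sketch assumes a two-class setting (for example, potable versus non-potable), which the abstract does not specify.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count TP, FP, FN, TN for a binary problem (hypothetical helper)."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1
        elif pred == positive and truth != positive:
            fp += 1
        elif pred != positive and truth == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def classification_metrics(tp, fp, fn, tn):
    """Derive the standard metrics from confusion-matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Toy labels for illustration only (1 = good quality, 0 = poor quality).
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]

counts = confusion_counts(y_true, y_pred)
metrics = classification_metrics(*counts)
print(counts, metrics)
```

Computing all four metrics from the same counts makes the trade-off visible: a model can reach high accuracy on imbalanced data while its recall for the minority class stays low, which is why the metrics are reported together.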
