Abstract

Objective: This paper compares several clustering methods on ill-structured datasets. The primary objective is to identify the best clustering method and to determine the optimal number of clusters. Methods: The dataset used in this experiment is derived from sensor measurements taken in an urban waste water treatment plant. Hierarchical, K-means, and PAM (Partitioning Around Medoids) clustering are compared. Three internal cluster validity indices (connectivity, the Dunn index, and the silhouette index) are used to validate the clusters, and the optimization of clustering is expressed in terms of the number of clusters. Finally, the experiment is repeated with varying numbers of clusters and the optimal scores are calculated. Findings: An optimal score list and an optimal rank list are generated, which reveal that hierarchical clustering is the optimal method. An optimal clustering minimizes connectivity and maximizes the silhouette and Dunn indices. Interpreting the results on this basis, the optimal number of clusters for the experimental dataset is K=2, and the optimal clustering method for this dataset is hierarchical. Applications: The experiment was carried out on a dataset derived from sensor measurements in an urban waste water treatment plant.
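To make the selection rule concrete (a higher Dunn index and a higher silhouette index indicate a better partition), the following is a minimal pure-Python sketch of those two indices. It is not the authors' code: the toy points, the two candidate labelings, and all function names are hypothetical and chosen only for illustration, and the connectivity index is omitted for brevity.

```python
import math
from itertools import combinations


def euclidean(a, b):
    """Euclidean distance between two points given as tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def dunn_index(points, labels):
    """Smallest between-cluster distance divided by the largest cluster diameter."""
    min_between = math.inf
    max_within = 0.0
    for i, j in combinations(range(len(points)), 2):
        d = euclidean(points[i], points[j])
        if labels[i] == labels[j]:
            max_within = max(max_within, d)
        else:
            min_between = min(min_between, d)
    return min_between / max_within


def silhouette_index(points, labels):
    """Mean silhouette width; singleton clusters contribute a width of 0."""
    clusters = {}
    for idx, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(idx)
    widths = []
    for i, lab in enumerate(labels):
        own = [j for j in clusters[lab] if j != i]
        if not own:
            widths.append(0.0)
            continue
        # a: mean distance to the other members of the point's own cluster
        a = sum(euclidean(points[i], points[j]) for j in own) / len(own)
        # b: mean distance to the members of the nearest other cluster
        b = min(
            sum(euclidean(points[i], points[j]) for j in members) / len(members)
            for other, members in clusters.items()
            if other != lab
        )
        widths.append((b - a) / max(a, b))
    return sum(widths) / len(widths)


# Hypothetical toy data: two well-separated groups of three points each.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
two_clusters = [0, 0, 0, 1, 1, 1]    # candidate partition with K=2
three_clusters = [0, 0, 2, 1, 1, 1]  # candidate partition with K=3

for name, labels in [("K=2", two_clusters), ("K=3", three_clusters)]:
    print(name, dunn_index(points, labels), silhouette_index(points, labels))
```

On this toy data both indices are higher for the K=2 labeling, which is the kind of agreement between indices that the rank-based comparison relies on. A real comparison would compute the indices for each clustering method and each candidate K, not for hand-written labelings.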
