Abstract

This paper compares two partitioning-based clustering techniques, Clustering Large Applications (CLARA) and K-Means, on the popular Iris dataset. CLARA forms clusters around medoids that it selects from random samples of the data, whereas K-Means forms clusters around centroids (the means of the points assigned to each cluster). For both techniques, the paper presents the cluster plot, the silhouette plot, and the Dunn index on the Iris dataset; all three are used for cluster validation. Silhouette analysis measures how well separated the resulting clusters are, and the silhouette plot shows how close each point in one cluster is to the points in the neighboring clusters. The Dunn index is another internal cluster validation measure; the higher the Dunn index, the better the clustering. All of the statistical analysis is carried out in R. The results indicate that CLARA performs better than K-Means on this dataset.
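
The two internal validation measures named above can be illustrated with a minimal sketch. It is written in Python rather than the R used in the paper, it is not the authors' code, and it makes several assumptions: the six 2-D points are toy data rather than Iris, the cluster labels are assigned by hand rather than produced by CLARA or K-Means, distances are Euclidean, and the function names `silhouette_widths` and `dunn_index` are hypothetical. The silhouette width of a point is taken as (b - a) / max(a, b), where a is its mean distance to the other members of its own cluster and b is its smallest mean distance to the members of any other cluster. The Dunn index is taken as the smallest between-cluster distance divided by the largest within-cluster distance.

```python
from itertools import combinations
from math import dist  # Euclidean distance, Python 3.8+


def silhouette_widths(points, labels):
    """Return the silhouette width (b - a) / max(a, b) of every point."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)

    widths = []
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            widths.append(0.0)  # common convention for singleton clusters
            continue
        # a: mean distance to the other members of the same cluster
        # (dist(p, p) == 0, so summing over the whole cluster is harmless)
        a = sum(dist(p, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster
        b = min(
            sum(dist(p, q) for q in other) / len(other)
            for other_lab, other in clusters.items()
            if other_lab != lab
        )
        widths.append((b - a) / max(a, b))
    return widths


def dunn_index(points, labels):
    """Smallest between-cluster distance over largest within-cluster distance."""
    pairs = list(combinations(zip(points, labels), 2))
    min_between = min(dist(p, q) for (p, lp), (q, lq) in pairs if lp != lq)
    max_within = max(dist(p, q) for (p, lp), (q, lq) in pairs if lp == lq)
    return min_between / max_within


# Toy data (an assumption, not Iris): two visibly separated groups of points.
points = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
good_labels = [0, 0, 0, 1, 1, 1]  # matches the visible grouping
bad_labels = [0, 0, 1, 1, 1, 1]   # puts (1, 0) in the wrong cluster

for name, labels in [("good", good_labels), ("bad", bad_labels)]:
    widths = silhouette_widths(points, labels)
    mean_width = sum(widths) / len(widths)
    print(name, "mean silhouette:", round(mean_width, 3),
          "Dunn index:", round(dunn_index(points, labels), 3))
```

In this sketch the labeling that matches the visible grouping scores higher on both measures than the labeling with one misassigned point, which is the sense in which a higher Dunn index or silhouette width indicates a better clustering.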
