A Survey on Internal Validity Measure for Cluster Validation

L Jegatha Deborah, A Kannan, R Baskaran

doi:10.5121/ijcses.2010.1207

Abstract

Data clustering is a technique for finding similar characteristics in a data set, which are inherently hidden, and grouping the data into groups called clusters. Different clustering algorithms produce different results because they are very sensitive to the characteristics of the original data set, especially noise and dimensionality. The quality of the clustering process determines the purity of the clusters, so it is very important to evaluate the results of a clustering algorithm. For this reason, cluster validation has been a major and challenging task. The major factor that influences cluster validation is the internal cluster validity measure used to choose the optimal number of clusters. The main objective of this article is to present a detailed description of the mathematical working of a few (not all) cluster validity indices, to classify these indices, and to explore ideas for future work in the domain of cluster validation. In addition, a maximization objective function is defined that is intended to support cluster validation.
