Abstract

Evaluating a given clustering result is a very difficult problem in real world. Cluster validity indices are developed for this purpose. There are two different types of cluster validity indices available : External and Internal. External cluster validity indices utilize some supervised information and internal cluster validity indices utilize the intrinsic structure of the data. In this paper a new external cluster validity index, MMI has been implemented based on Max-Min distance among data points and prior information based on structure of the data. A new probabilistic approach has been implemented to find the correct correspondence between the true and obtained clustering. Genetic K-means algorithm (GAK-means) and single linkage have been used as the underlying clustering techniques. Results of the proposed index for identifying the appropriate number of clusters is shown for five artificial and two real-life data sets. GAK-means and single linkage clustering techniques are used as the underlying partitioning techniques with the number of clusters varied over a range. The MMI index is then used to determine the appropriate number of clusters. The performance of MMI is compared with existing external cluster validity indices, adjusted rand index (ARI) and rand index (RI). It works well for two class and multi class data sets.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call