Abstract

Estimating the optimal number of clusters (NC) is a pivotal problem in cluster analysis. Working from the viewpoint of sample geometry, this paper designs a novel internal clustering validity index, termed the between-within cluster (BWC) index, and proposes a method for estimating the optimal NC. The BWC index improves on the well-known Silhouette index. It validates the clustering results produced by a given clustering algorithm (e.g., affinity propagation or hierarchical clustering) and estimates the optimal NC for many kinds of data sets, including synthetic data sets, benchmark data sets, UCI data sets, gene expression data sets, and images. Theoretical analysis and experimental studies demonstrate the effectiveness and high efficiency of the new index and method.
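
As a rough illustration of how an internal validity index can be used to estimate NC, the sketch below scores hierarchical clusterings with the standard Silhouette index, the baseline that the abstract says BWC improves on. It is not the BWC index, whose definition the abstract does not give. The function names (average_linkage, mean_silhouette, estimate_nc), the candidate range of NC, and the toy data are all assumptions made for this example.

```python
import math
from itertools import combinations


def average_linkage(points, k):
    """Agglomerative clustering with average linkage; returns one label per point."""
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > k:
        best = None
        # Find the pair of clusters with the smallest mean pairwise distance.
        for a, b in combinations(range(len(clusters)), 2):
            total = sum(math.dist(points[i], points[j])
                        for i in clusters[a] for j in clusters[b])
            d = total / (len(clusters[a]) * len(clusters[b]))
            if best is None or d < best[0]:
                best = (d, a, b)
        _, a, b = best
        clusters[a].extend(clusters[b])
        del clusters[b]  # b > a, so index a stays valid
    labels = [0] * len(points)
    for label, members in enumerate(clusters):
        for i in members:
            labels[i] = label
    return labels


def mean_silhouette(points, labels):
    """Mean Silhouette value over all points (singleton clusters contribute 0)."""
    groups = {}
    for i, label in enumerate(labels):
        groups.setdefault(label, []).append(i)
    total = 0.0
    for i, p in enumerate(points):
        own = groups[labels[i]]
        if len(own) == 1:
            continue
        # a: mean distance to the other members of the point's own cluster.
        a = sum(math.dist(p, points[j]) for j in own if j != i) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster.
        b = min(sum(math.dist(p, points[j]) for j in other) / len(other)
                for label, other in groups.items() if label != labels[i])
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / len(points)


def estimate_nc(points, k_min=2, k_max=6):
    """Return the candidate NC with the highest mean Silhouette, plus all scores."""
    scores = {k: mean_silhouette(points, average_linkage(points, k))
              for k in range(k_min, k_max + 1)}
    return max(scores, key=scores.get), scores


# Toy data (assumed for illustration): three well-separated unit squares.
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11),
          (0, 10), (0, 11), (1, 10), (1, 11)]
best_k, scores = estimate_nc(points)
```

Any internal validity index that scores a partition could be substituted for mean_silhouette here without changing the selection loop in estimate_nc.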
