Abstract

The C index is an internal cluster validity index, introduced in 1970, for defining and identifying a “best” crisp partition of n objects represented by either unlabeled feature vectors or dissimilarity matrix data. It is often one of the better performers among the many internal indices available for this task. This paper develops a soft generalization of the C index that can be used to evaluate sets of candidate partitions found by either fuzzy or probabilistic clustering algorithms. We define four generalizations based on relational transformations of the soft partition and then compare their performance to eight other popular internal fuzzy cluster indices using two methods of comparison (internal “best-c” and internal/external (I/E) “best match”), six synthetic datasets, and six real-world labeled datasets. Our main conclusion is that the sum-min generalization is the second-best performer in the best-c tests and the best performer in the I/E tests on small data.
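
For orientation, the crisp C index that the paper generalizes can be sketched as follows. This is a minimal illustration, assuming the commonly cited form C = (S_w - S_min) / (S_max - S_min), where S_w is the sum of within-cluster pairwise dissimilarities, N_w is the number of within-cluster pairs, and S_min and S_max are the sums of the N_w smallest and N_w largest pairwise dissimilarities overall. The function name `c_index` and its arguments are illustrative, not taken from the paper, and the sketch does not implement any of the four soft generalizations.

```python
from itertools import combinations


def c_index(dist, labels):
    """Crisp C index for a symmetric dissimilarity matrix and hard labels.

    dist   : n x n nested sequence of pairwise dissimilarities (assumed symmetric)
    labels : length-n sequence of crisp cluster labels
    Returns a value in [0, 1]; smaller values indicate a better partition.
    """
    n = len(labels)
    pairs = list(combinations(range(n), 2))

    # All pairwise dissimilarities, sorted ascending.
    all_d = sorted(dist[i][j] for i, j in pairs)

    # Dissimilarities between objects that share a cluster.
    within = [dist[i][j] for i, j in pairs if labels[i] == labels[j]]
    n_w = len(within)
    if n_w == 0 or n_w == len(all_d):
        raise ValueError("C index is undefined for this partition")

    s_w = sum(within)
    s_min = sum(all_d[:n_w])   # sum of the n_w smallest dissimilarities
    s_max = sum(all_d[-n_w:])  # sum of the n_w largest dissimilarities
    if s_max == s_min:
        raise ValueError("C index is undefined when S_max equals S_min")

    return (s_w - s_min) / (s_max - s_min)


# Four points on a line forming two tight, well-separated groups.
points = [0.0, 1.0, 10.0, 11.0]
D = [[abs(a - b) for b in points] for a in points]

good = c_index(D, [0, 0, 1, 1])  # partition that matches the groups
bad = c_index(D, [0, 1, 0, 1])   # partition that splits each group
```

In this toy case the partition that matches the two groups scores lower than the one that splits them, which is the sense in which the index identifies a “best” crisp partition among candidates.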
