Validity Index for Clustered Data in Non-negative Space

Soumita Modak

doi:10.1177/00080683231172377

Abstract

We propose a novel nonparametric cluster validity index that can be used to estimate the unknown number of clusters present in a data set, to assess the quality of classification for a clustered set of data members, or to compare the clustering outputs obtained from different algorithms. The measure depends only on the observation-wise distances of the non-negative clustered data from the origin, in a space of arbitrary dimension. Its fast implementation makes it appealing for big data analysis, and its applicability to high-dimensional data widens its usefulness. Easy interpretation, a simple algorithm, speedy computation and strong performance, demonstrated through data studies, establish the proposed validity index as a strong cluster accuracy measure among the established ones in the literature.

AMS subject classification: 62H30
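
As an illustration of the only input the abstract says the index requires, the following minimal Python sketch computes each observation's distance from the origin and groups those distances by cluster label. It is not the proposed index itself, whose formula is not stated in the abstract. The function name `origin_distances_by_cluster` is hypothetical, and the use of the Euclidean distance is an assumption made for illustration.

```python
import math
from collections import defaultdict


def origin_distances_by_cluster(points, labels):
    """Group each observation's distance from the origin by cluster label.

    Assumptions (not taken from the paper): Euclidean distance is used,
    and `labels[i]` is the cluster assigned to `points[i]`.
    """
    if len(points) != len(labels):
        raise ValueError("points and labels must have the same length")

    groups = defaultdict(list)
    for row, label in zip(points, labels):
        # The index is stated for data in non-negative space only.
        if any(value < 0 for value in row):
            raise ValueError("all coordinates must be non-negative")
        # One distance per observation, whatever the dimension of the data.
        groups[label].append(math.sqrt(sum(value * value for value in row)))
    return dict(groups)


if __name__ == "__main__":
    points = [[3, 4], [0, 0], [6, 8]]
    labels = [0, 0, 1]
    print(origin_distances_by_cluster(points, labels))
    # → {0: [5.0, 0.0], 1: [10.0]}
```

Each observation is reduced to a single scalar in one pass over the data, which is consistent with the abstract's claim of fast computation in arbitrary dimension.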
