Abstract
Many stability measures, such as Normalized Mutual Information (NMI), have been proposed to validate a set of partitionings. A set of partitionings may well contain one or more high-quality clusters yet still be judged poor by a stability measure and, as a result, be neglected entirely. Inspired by approaches that evaluate a set of partitionings, researchers have tried to define new measures for evaluating an individual cluster. Thus far, these cluster-level measures have mostly been based on the well-known NMI measure. This paper discusses the drawback of that common approach and then proposes a new asymmetric criterion, the Alizadeh–Parvin–Moshki–Minaei (APMM) criterion, to assess the association between a cluster and a set of partitionings. We show that the APMM criterion overcomes the deficiency of the conventional NMI measure. We also propose a clustering ensemble framework that uses APMM to find the best-performing clusters. The framework uses Average APMM (AAPMM) as a fitness measure to select a subset of clusters instead of using all of the results. Any cluster that satisfies a predefined threshold on this measure is selected to participate in an elite ensemble. The chosen clusters are combined with a co-association-matrix-based consensus function, which yields the final set of partitionings. Because Evidence Accumulation Clustering (EAC) cannot properly derive the co-association matrix from a subset of clusters, a new EAC-based method, called Extended EAC (EEAC), is employed to construct the co-association matrix from the chosen subset. Empirical studies show that the proposed approach outperforms other cluster ensemble approaches.
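The selection and combination steps can be illustrated with a minimal sketch. The abstract does not give the APMM formula or the EEAC definition, so everything specific here is an assumption: the fitness scores are taken as already computed, the function names `select_clusters` and `coassociation_from_clusters` are hypothetical, and the normalization n_ij / max(n_i, n_j) is one plausible way to build a co-association matrix when each point may appear in a different number of selected clusters.

```python
import numpy as np


def select_clusters(clusters, scores, threshold):
    """Keep only the clusters whose fitness score meets the threshold.

    `scores` stands in for a per-cluster fitness such as AAPMM; how it is
    computed is not specified here and is assumed to be done elsewhere.
    """
    return [c for c, s in zip(clusters, scores) if s >= threshold]


def coassociation_from_clusters(clusters, n_points):
    """Build a co-association matrix from an arbitrary subset of clusters.

    Assumption (not taken from the abstract): entry (i, j) is
    n_ij / max(n_i, n_j), where n_ij counts the selected clusters that
    contain both points and n_i counts those that contain point i.
    Points that appear in no selected cluster get a row of zeros.
    """
    # Binary membership matrix: one row per cluster, one column per point.
    member = np.zeros((len(clusters), n_points), dtype=float)
    for k, cluster in enumerate(clusters):
        member[k, list(cluster)] = 1.0

    together = member.T @ member              # n_ij for every pair
    counts = np.diag(together)                # n_i for every point
    denom = np.maximum.outer(counts, counts)  # max(n_i, n_j)

    coassoc = np.zeros_like(together)
    np.divide(together, denom, out=coassoc, where=denom > 0)
    return coassoc


# Toy usage: three candidate clusters over four points, made-up scores.
candidates = [{0, 1}, {0, 1, 2}, {2, 3}]
fitness = [0.9, 0.8, 0.1]
elite = select_clusters(candidates, fitness, threshold=0.5)
matrix = coassociation_from_clusters(elite, n_points=4)
```

Plain EAC divides every pair count by the total number of partitionings, which presumes that each point is assigned exactly once per partitioning. After clusters are filtered individually that no longer holds, which is why the sketch normalizes per pair instead.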