Generalized Pair-Counting Similarity Measures for Clustering and Cluster Ensembles

Shaohong Zhang, Xiaofei Xing, Dongqing Xie, Ying Gao, Hau-San Wong, Zongbao Yang

doi:10.1109/access.2017.2741221

Open Access

Journal: IEEE Access
Publication Date: Jan 1, 2017
Citations: 57
License type: CC BY-NC-ND

Affiliation: Guangzhou University, City University of Hong Kong

Abstract

In this paper, we propose a number of pair-counting similarity measures associated with a general formulation of cluster ensembles. These measures are motivated by the need to evaluate, in a uniform manner, the consistency between an individual clustering solution and a cluster ensemble solution, or between different cluster ensemble solutions. We propose a number of criteria for comparing these generalized measures, from the perspectives of both theoretical analysis and experimental validation. We identify their different behaviors and their correlations in different scenarios of traditional clustering solutions and cluster ensembles, in the hope that the results of these studies can 1) serve as important criteria for the design and selection of evaluation measures for clustering solutions, and 2) provide explanations for ambiguous clustering results in related scenarios. Experiments on both synthetic and real data sets are conducted to verify our findings.
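
For readers unfamiliar with pair counting, the following is a minimal sketch of the classical setting only: two individual clusterings of the same objects are compared by counting object pairs that each clustering places together or apart. It is not the generalized formulation for cluster ensembles proposed in the paper; the Rand and Jaccard indices are shown as well-known examples of such measures, and the function names are illustrative assumptions.

```python
from itertools import combinations


def pair_counts(labels_a, labels_b):
    """Classify every object pair by whether each clustering groups it together.

    Returns (n11, n10, n01, n00):
      n11: same cluster in both clusterings
      n10: same cluster in A only
      n01: same cluster in B only
      n00: different clusters in both
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("both clusterings must label the same objects")
    n11 = n10 = n01 = n00 = 0
    for i, j in combinations(range(len(labels_a)), 2):
        same_a = labels_a[i] == labels_a[j]
        same_b = labels_b[i] == labels_b[j]
        if same_a and same_b:
            n11 += 1
        elif same_a:
            n10 += 1
        elif same_b:
            n01 += 1
        else:
            n00 += 1
    return n11, n10, n01, n00


def rand_index(labels_a, labels_b):
    """Fraction of pairs on which the two clusterings agree."""
    n11, n10, n01, n00 = pair_counts(labels_a, labels_b)
    total = n11 + n10 + n01 + n00
    # With fewer than two objects there are no pairs; treat as full agreement.
    return (n11 + n00) / total if total else 1.0


def jaccard_index(labels_a, labels_b):
    """Agreement on co-clustered pairs, ignoring pairs separated in both."""
    n11, n10, n01, _ = pair_counts(labels_a, labels_b)
    denom = n11 + n10 + n01
    # If neither clustering groups any pair, treat as full agreement.
    return n11 / denom if denom else 1.0
```

Because only co-membership of pairs is compared, both functions are invariant to how cluster labels are named: `rand_index([0, 0, 1, 1], [1, 1, 0, 0])` evaluates to 1.0.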

Full Text