A new consensus function based on dual-similarity measurements for clustering ensemble

Tahani Alqurashi, Wenjia Wang

doi:10.1109/dsaa.2015.7344797

Abstract

Clustering ensemble is an unsupervised learning method that combines a number of partitions in order to produce a better clustering result. In this paper, we propose a clustering ensemble algorithm named Dual-Similarity Clustering Ensemble (DSCE). The core of our ensemble is a consensus function, which consists of three stages. The first stage transforms the initial clusters into a binary representation, and the second measures the similarity between the initial clusters and merges the most similar ones. The third identifies candidate clusters, which contain only certain objects, and calculates their quality. The final clustering result is produced by an iterative process that assigns each uncertain object to the cluster on whose quality it has the minimum effect. The number of clusters in the final clustering result converges to a stable value from the generated members, in contrast to most existing methods, which require the user to provide the number of clusters in advance. Experimental results on real datasets indicate that our method is statistically significantly better than other state-of-the-art clustering ensemble methods, including the CO and DICLENS algorithms.
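
The first two stages can be illustrated with a minimal Python sketch. It is not the DSCE implementation: the abstract does not specify the similarity measure, so Jaccard similarity is used here purely as an assumption, and the function names (`binary_membership`, `jaccard`, `most_similar_pair`) and the toy partitions are hypothetical.

```python
from itertools import combinations


def binary_membership(partitions):
    """Stage 1 (sketch): build one binary row per initial cluster.

    partitions: list of label lists, one per ensemble member, all the same length.
    Returns a list of 0/1 rows; row k marks which objects belong to initial cluster k.
    """
    rows = []
    for labels in partitions:
        for cluster_id in sorted(set(labels)):
            rows.append([1 if lab == cluster_id else 0 for lab in labels])
    return rows


def jaccard(a, b):
    """Assumed similarity (not from the paper): shared objects / objects in either cluster."""
    inter = sum(x & y for x, y in zip(a, b))
    union = sum(x | y for x, y in zip(a, b))
    return inter / union if union else 0.0


def most_similar_pair(rows):
    """Stage 2 (sketch): return (similarity, i, j) for the most similar pair of clusters."""
    return max(
        (jaccard(rows[i], rows[j]), i, j)
        for i, j in combinations(range(len(rows)), 2)
    )


# Toy ensemble: two members, each partitioning the same six objects.
partitions = [
    [0, 0, 0, 1, 1, 1],  # member 1
    [0, 0, 1, 1, 1, 1],  # member 2
]
rows = binary_membership(partitions)
sim, i, j = most_similar_pair(rows)
print(sim, i, j)
```

For this toy input, the most similar pair is the second cluster of each member, which share three of four objects. A merging step would then combine such pairs, and objects whose membership remains ambiguous after merging would correspond to the uncertain objects mentioned above.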

Full Text