Abstract

How to combine multiple clusterings into a single clustering of better quality is a central problem in cluster ensembles. In this paper, we extend Strehl's consensus function based on information-theoretic principles and propose a novel weighted consensus function for combining multiple soft clusterings. Our consensus function uses mutual information to measure the information shared between two soft clusterings and gives greater emphasis to clusterings that differ substantially from the others. We use an algorithm similar to sequential k-means to solve for this consensus function, and we conduct experiments on four real-world datasets to compare our algorithm with four other consensus functions: CSPA, HGPA, MCLA, and QMI. The results indicate that our consensus function provides solutions of better quality than CSPA, HGPA, MCLA, and QMI, and that when diversity is unevenly distributed across a cluster ensemble, accounting for the influence of diversity can improve the quality of the ensemble clustering.
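
As a rough illustration of the two ingredients named above, the following sketch computes mutual information between two soft clusterings and derives weights that favor the more distinctive clusterings. It rests on assumptions not stated in the abstract: each soft clustering is given as a list of per-object membership rows that sum to 1, the joint cluster distribution is estimated by averaging the products of memberships over objects, and the weighting rule (a normalized `exp(-mean MI)`) is a hypothetical stand-in, not the weighting used in the paper. The function names `soft_mutual_information` and `diversity_weights` are likewise illustrative.

```python
import math


def soft_mutual_information(a, b):
    """Mutual information (in nats) between two soft clusterings.

    a[x][i] is the membership of object x in cluster i of clustering a;
    b is defined likewise. Assumption: the joint distribution is
    P(i, j) = (1/n) * sum_x a[x][i] * b[x][j].
    """
    n = len(a)
    ka, kb = len(a[0]), len(b[0])
    joint = [[0.0] * kb for _ in range(ka)]
    for row_a, row_b in zip(a, b):
        for i, pa in enumerate(row_a):
            for j, pb in enumerate(row_b):
                joint[i][j] += pa * pb / n
    marg_a = [sum(joint[i]) for i in range(ka)]
    marg_b = [sum(joint[i][j] for i in range(ka)) for j in range(kb)]
    mi = 0.0
    for i in range(ka):
        for j in range(kb):
            p = joint[i][j]
            if p > 0.0:
                mi += p * math.log(p / (marg_a[i] * marg_b[j]))
    return mi


def diversity_weights(clusterings):
    """Hypothetical weighting: clusterings sharing less information with
    the rest of the ensemble receive larger weights (weights sum to 1)."""
    m = len(clusterings)
    scores = []
    for i in range(m):
        mean_mi = sum(
            soft_mutual_information(clusterings[i], clusterings[j])
            for j in range(m) if j != i
        ) / (m - 1)
        scores.append(math.exp(-mean_mi))
    total = sum(scores)
    return [s / total for s in scores]


# Toy ensemble: two identical clusterings and one that is independent of them.
a = [[1, 0], [1, 0], [0, 1], [0, 1]]
c = [[1, 0], [0, 1], [1, 0], [0, 1]]
weights = diversity_weights([a, a, c])
```

In this toy ensemble the third clustering shares no information with the other two, so under the illustrative rule it receives the largest weight.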
