Abstract

Clustering ensembles have become very popular in the past few years because of their potential to improve clustering results. Roughly speaking, the technique combines different partitions of the same set of objects in order to obtain a consensus partition. A common way of defining the consensus partition is as the solution of the median partition problem, which makes finding it a complex optimization problem. In this paper, we study possible prunes of the search space for this optimization problem. In particular, we introduce a new prune that allows a dramatic reduction of the search space. We also characterize the family of dissimilarity measures that can take advantage of this prune, and we present two measures that belong to this family. We carry out an experimental study on synthetic data, comparing, under different circumstances, the size of the original search space with its size after the proposed prunes. Outstanding reductions are obtained, which can benefit the development of clustering ensemble algorithms. We also compare, on real data, the behavior of a simulated annealing-based ensemble algorithm in the original partition space and in the two proposed pruned spaces. In all cases, the proposed prunes allow the algorithm to find solutions closer to the theoretical optimum.
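
To make the median partition problem concrete, the following is a minimal brute-force sketch. It is not the method of this paper and does not implement the proposed prunes. It assumes the Mirkin distance (the number of object pairs on which two partitions disagree) as the dissimilarity measure, which is only one illustrative choice and may differ from the two measures presented in the paper. The function names `all_partitions`, `mirkin`, and `median_partition` are hypothetical.

```python
from itertools import combinations


def all_partitions(n):
    """Yield every partition of objects 0..n-1 as a tuple of block labels.

    Labels are generated in restricted-growth form, so each partition
    appears exactly once; the number of results is the n-th Bell number.
    """
    def grow(labels, num_blocks):
        if len(labels) == n:
            yield tuple(labels)
            return
        # The next object joins an existing block or opens a new one.
        for b in range(num_blocks + 1):
            labels.append(b)
            yield from grow(labels, max(num_blocks, b + 1))
            labels.pop()

    yield from grow([], 0)


def mirkin(p, q):
    """Assumed dissimilarity: pairs co-clustered in exactly one of p and q."""
    pairs = combinations(range(len(p)), 2)
    return sum((p[i] == p[j]) != (q[i] == q[j]) for i, j in pairs)


def median_partition(ensemble):
    """Exhaustively search the unpruned space for a partition minimizing
    the sum of dissimilarities to all partitions in the ensemble."""
    n = len(ensemble[0])
    return min(all_partitions(n),
               key=lambda c: sum(mirkin(c, p) for p in ensemble))


# Three partitions of four objects, given as label tuples.
ensemble = [(0, 0, 1, 1), (0, 0, 1, 1), (0, 1, 1, 1)]
consensus = median_partition(ensemble)
```

Because the number of candidate partitions grows as the Bell numbers (15 for four objects, more than 100,000 for ten), exhaustive search of this kind is feasible only for very small sets, which is why reducing the search space matters.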
