Abstract

Over the years, ensemble algorithms have become a common solution for supervised learning tasks, as they are more robust and usually more accurate than single models. For unsupervised learning, and clustering in particular, ensemble methods remain poorly researched. In this work we propose a group of ensemble clustering algorithms that build on existing ideas and improve them with modern hyperparameter tuning algorithms and heuristics for choosing quality measures. Our proposed approach uses MASSCAH to choose the base clustering algorithm and optimize its hyperparameters using SMAC. The proposed solution was compared experimentally with existing solutions and with a k-means algorithm tuned by an expert. Experiments were conducted on 64 synthetic datasets drawn from Gaussian distributions and on 20 real-world datasets. We used 4 cluster validity indices to evaluate the quality of the resulting partitions. From the experimental results we conclude that the proposed solution performs comparably to k-means tuned by an expert and outperforms existing ensemble clustering solutions. In addition, the proposed solution can be used as an automatic solution for clustering.

