Semi-supervised Clustering Ensemble Based on Collaborative Training

Jinyuan Zhang, Feifei Huang, Amjad Mahmood, Yan Yang, Hongjun Wang

doi:10.1007/978-3-642-31900-6_55

Abstract

Recent research on data clustering increasingly focuses on combining multiple data partitions to improve the robustness of clustering solutions. Most of this work addresses crisp clustering combination. Semi-supervised clustering uses a small amount of labeled data to aid and bias the clustering of unlabeled data. In this paper, we propose a semi-supervised clustering ensemble model based on collaborative training (SCET) and an unsupervised clustering ensemble model based on collaborative training (UCET). In the ensemble step of SCET, semi-supervised learning is introduced, while in UCET the knowledge used in SCET is replaced by information extracted from the base clusterings. Tri-training is then used as the consensus function of the clustering ensemble. Experiments on datasets from the UCI Machine Learning Repository indicate that the model improves clustering accuracy.

Keywords

Semi-supervised clustering ensemble model; collaborative training; semi-supervised learning; clustering ensemble
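
The abstract names tri-training as the consensus step but does not spell out its mechanics. As a rough illustration only, the sketch below shows one simplified round of generic tri-training: each of three classifiers receives a pseudo-labeled point when the other two agree on its label, and the final label is a majority vote. The 1-D points, the 1-nearest-neighbour classifier, the hand-picked training subsets, and all function names are assumptions made for this example; it omits the bootstrap sampling and error-rate checks of full tri-training and is not the SCET or UCET procedure itself, whose inputs (base clusterings) are not detailed in the abstract.

```python
from collections import Counter


def nearest_label(train, x):
    """1-nearest-neighbour prediction; train is a list of (value, label) pairs."""
    return min(train, key=lambda p: abs(p[0] - x))[1]


def tri_training_round(train_sets, unlabeled):
    """One simplified tri-training round (illustrative assumption).

    For classifier i, an unlabeled point is pseudo-labeled and added to
    i's training set when the other two classifiers agree on its label.
    """
    new_sets = []
    for i in range(3):
        j, k = [m for m in range(3) if m != i]
        added = []
        for x in unlabeled:
            label_j = nearest_label(train_sets[j], x)
            label_k = nearest_label(train_sets[k], x)
            if label_j == label_k:
                added.append((x, label_j))
        new_sets.append(train_sets[i] + added)
    return new_sets


def consensus(train_sets, x):
    """Final label by majority vote of the three classifiers."""
    votes = Counter(nearest_label(t, x) for t in train_sets)
    return votes.most_common(1)[0][0]


# Hypothetical toy data: three small labeled subsets and four unlabeled points.
train_sets = [
    [(0.0, "a"), (10.0, "b")],
    [(1.0, "a"), (9.0, "b")],
    [(0.0, "a"), (9.0, "b")],
]
unlabeled = [2.0, 4.0, 6.0, 8.0]

new_sets = tri_training_round(train_sets, unlabeled)
print(consensus(new_sets, 3.0), consensus(new_sets, 7.0))
```

In this toy setting all three classifiers agree on every unlabeled point, so each training set grows by four pseudo-labeled examples. With real data the agreement would be partial, which is where the mutual teaching between classifiers becomes useful.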
