Abstract

In many real-world applications, only a few labeled samples exist, while a large number of unlabeled samples are available. With so few labels, it is difficult for some traditional semi-supervised algorithms to train classifiers that are good enough to evaluate the labeling confidence of unlabeled samples. In this paper, a new semi-supervised classification algorithm based on clustering ensembles, named SSCCE, is proposed. It takes advantage of clustering ensembles to generate multiple partitions of a given dataset, and then uses a clustering consistency index to determine the labeling confidence of unlabeled samples. The algorithm overcomes some defects of traditional semi-supervised classification algorithms and enhances the performance of a hypothesis trained on very few labeled samples by exploiting a large number of unlabeled samples. Experiments on ten public data sets from the UCI Machine Learning Repository show that the method is effective and feasible.
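
The abstract does not define the clustering consistency index, so the following is only a rough sketch of the general idea, not the SSCCE algorithm itself. It assumes the ensemble is built from repeated k-means runs with different random initialisations, and it uses a simple co-clustering frequency as a stand-in for the consistency index: an unlabeled sample's confidence for a class is the fraction of partitions in which it falls into the same cluster as a labeled sample of that class. The function names (`kmeans`, `ensemble_partitions`, `labeling_confidence`) and the toy data are hypothetical and chosen for illustration only.

```python
import random


def kmeans(points, k, seed, n_iter=20):
    """Plain Lloyd's k-means on a list of equal-length tuples.

    Returns one cluster id per point. Assumption: used here only as a
    convenient base clusterer; the abstract does not name one.
    """
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assign each point to its nearest center (squared Euclidean distance).
        labels = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            for p in points
        ]
        # Move each center to the mean of its members; keep it if the cluster is empty.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels


def ensemble_partitions(points, k, n_runs):
    """Build the ensemble: one partition per random initialisation."""
    return [kmeans(points, k, seed) for seed in range(n_runs)]


def labeling_confidence(partitions, labeled, n_samples):
    """Score unlabeled samples by how consistently they co-cluster with labeled ones.

    labeled maps sample index -> class. Returns a dict mapping each unlabeled
    index to (predicted class, confidence). Assumption: this co-clustering
    frequency is a stand-in, not the paper's clustering consistency index.
    """
    classes = sorted(set(labeled.values()))
    result = {}
    for i in range(n_samples):
        if i in labeled:
            continue
        scores = {}
        for cls in classes:
            anchors = [j for j, y in labeled.items() if y == cls]
            # Fraction of (partition, anchor) pairs in which sample i
            # shares a cluster with a labeled sample of this class.
            hits = sum(part[i] == part[j] for part in partitions for j in anchors)
            scores[cls] = hits / (len(partitions) * len(anchors))
        best = max(classes, key=lambda c: scores[c])
        result[i] = (best, scores[best])
    return result


# Toy data: two well-separated groups with a single labeled sample in each.
points = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (0.1, 0.0),
          (5.0, 5.0), (5.1, 5.2), (5.2, 5.1), (5.0, 5.1)]
labeled = {0: "a", 4: "b"}

partitions = ensemble_partitions(points, k=2, n_runs=5)
confidence = labeling_confidence(partitions, labeled, len(points))
for index, (cls, score) in sorted(confidence.items()):
    print(index, cls, score)
```

In a self-training loop, one plausible use of these scores would be to add only the highest-confidence samples to the labeled set before retraining the classifier. Whether SSCCE selects samples this way is not stated in the abstract.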
