Abstract

Compared with labeled data, unlabeled data are significantly easier to obtain, yet classifying them remains an open problem. This paper proposes a novel SVM-KNN classification methodology based on semi-supervised learning. We consider the problem of using a large amount of unlabeled data to boost the performance of a classifier when only a small set of labeled examples is available. We use the few labeled examples to train a weak SVM classifier, then improve it iteratively by applying KNN to the boundary vectors. The KNN classifier not only enlarges the set of training examples but also improves the quality of the new training examples, which are derived from the boundary vectors. Experiments on UCI data sets show that, with tuned parameters, the proposed methodology can clearly improve the accuracy of the final SVM classifier and can reduce the cost of labeling unlabeled examples.
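The iterative scheme described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the SVM kernel, how boundary vectors are selected, or which examples KNN votes over. The sketch assumes a linear SVM trained by batch subgradient descent, treats unlabeled points with |f(x)| < 1 as boundary vectors, and lets KNN vote over the currently labeled set. All function names, parameters, and toy data are hypothetical.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def train_linear_svm(X, y, epochs=500, lr=0.01, lam=0.01):
    """Stand-in SVM: batch subgradient descent on the regularized hinge loss.
    Labels must be in {-1, +1}. Returns weights w and bias b."""
    n, w, b = len(X), [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        grad_w, grad_b = [lam * wj for wj in w], 0.0
        for xi, yi in zip(X, y):
            if yi * (dot(w, xi) + b) < 1:  # example violates the margin
                grad_w = [g - yi * xj / n for g, xj in zip(grad_w, xi)]
                grad_b -= yi / n
        w = [wj - lr * g for wj, g in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b

def knn_label(x, X, y, k=3):
    """Majority vote among the k labeled points nearest to x (use odd k)."""
    nearest = sorted(range(len(X)), key=lambda i: math.dist(x, X[i]))[:k]
    return 1 if sum(y[i] for i in nearest) > 0 else -1

def svm_knn(X_lab, y_lab, X_unl, max_rounds=5, k=3):
    """Assumed loop: train SVM, label boundary vectors with KNN, retrain."""
    X_lab, y_lab, X_unl = list(X_lab), list(y_lab), list(X_unl)
    w, b = train_linear_svm(X_lab, y_lab)
    for _ in range(max_rounds):
        # Assumption: "boundary vectors" are unlabeled points inside the margin.
        boundary = [x for x in X_unl if abs(dot(w, x) + b) < 1]
        if not boundary:
            break
        new_labels = [knn_label(x, X_lab, y_lab, k) for x in boundary]
        X_lab += boundary
        y_lab += new_labels
        X_unl = [x for x in X_unl if x not in boundary]
        w, b = train_linear_svm(X_lab, y_lab)  # retrain on the enlarged set
    return (w, b), X_lab, y_lab, X_unl

# Toy data (hypothetical): four labeled points, four unlabeled points.
X_lab = [(-2, -2), (-3, -1), (2, 2), (3, 1)]
y_lab = [-1, -1, 1, 1]
X_unl = [(-0.5, -0.2), (0.4, 0.3), (-4, -3), (4, 3)]
(w, b), X_new, y_new, X_rest = svm_knn(X_lab, y_lab, X_unl)
```

In this toy run, the two unlabeled points close to the decision boundary are labeled by KNN and added to the training set, while the two points far from the boundary stay unlabeled. A real experiment would likely use a kernel SVM from an established library and a held-out test set to measure accuracy.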

