Abstract

With the rapid expansion of big data across science and engineering, multi-label classification has become an increasingly pressing problem for real-world data sets. Because labeled samples are scarce, it is difficult to build an effective model with traditional supervised learning methods. In this paper, we propose a multi-label k-nearest neighbor classification method based on semi-supervised learning (SSML-kNN for short). SSML-kNN first builds a semi-supervised self-training model, in which a multi-label k-nearest neighbor classifier based on correlation degree is used to classify the unlabeled data. The intermediate predictions with a high confidence level are then selected and added to the training data set. This training step is repeated to keep expanding the labeled data set. Finally, the test set is classified by the trained model. Experiments on the public emotions and yeast data sets demonstrate that SSML-kNN achieves better results than other related approaches across various evaluation metrics.
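
The self-training loop described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the correlation-degree scoring of SSML-kNN is not specified in the abstract, so a plain neighbor-vote fraction is swapped in as the per-label score. The function names (`knn_label_scores`, `self_train`, `predict`), the parameters `k`, `threshold`, and `max_rounds`, the confidence rule, and the toy data are all assumptions made for illustration.

```python
import math


def knn_label_scores(x, X, Y, k):
    """For each label, the fraction of the k nearest labeled points carrying it.

    Assumption: a simple vote fraction stands in for the paper's
    correlation-degree score, which the abstract does not define.
    """
    order = sorted(range(len(X)), key=lambda i: math.dist(x, X[i]))
    neighbors = order[:k]
    n_labels = len(Y[0])
    return [sum(Y[i][j] for i in neighbors) / len(neighbors) for j in range(n_labels)]


def self_train(labeled_X, labeled_Y, unlabeled_X, k=3, threshold=0.9, max_rounds=10):
    """Repeatedly move confidently predicted unlabeled points into the labeled set."""
    X, Y = list(labeled_X), list(labeled_Y)
    pool = list(unlabeled_X)
    for _ in range(max_rounds):
        accepted, remaining = [], []
        for x in pool:
            scores = knn_label_scores(x, X, Y, k)
            # Hypothetical confidence rule: every label must be near 0 or near 1.
            confidence = min(max(s, 1 - s) for s in scores)
            if confidence >= threshold:
                accepted.append((x, [1 if s >= 0.5 else 0 for s in scores]))
            else:
                remaining.append(x)
        if not accepted:
            break  # nothing confident enough was found, so stop early
        for x, y in accepted:
            X.append(x)
            Y.append(y)
        pool = remaining
    return X, Y


def predict(x, X, Y, k=3):
    """Classify a test point with the expanded labeled set."""
    return [1 if s >= 0.5 else 0 for s in knn_label_scores(x, X, Y, k)]


# Toy data (made up): two clusters, two binary labels per sample.
labeled_X = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
labeled_Y = [[1, 0], [1, 0], [1, 0], [0, 1], [0, 1], [0, 1]]
unlabeled_X = [(0.5, 0.5), (10.5, 10.5), (5.2, 5.2)]

X, Y = self_train(labeled_X, labeled_Y, unlabeled_X)
near_first_cluster = predict((0.2, 0.3), X, Y)
near_second_cluster = predict((10.2, 10.1), X, Y)
```

In this sketch, pseudo-labels accepted in one round influence the neighbors seen in the next, which is how the labeled set grows. It is also how an early mistake can propagate, so the confidence threshold is the main parameter to tune.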
