Abstract

Safe Semi-Supervised Learning (S3L) has recently become an active topic in Semi-Supervised Learning (SSL). Unlabeled data can affect SSL performance either positively or negatively, so S3L methods exploit them more safely through risk-based strategies and are expected to perform at least as well as the corresponding Supervised Learning (SL) methods. Although previously proposed S3L methods consider the risk of unlabeled data, they do not explicitly model how the different risk degrees of unlabeled data affect the learning procedure. In this paper, we therefore propose risk-based safe Laplacian Regularized Least Squares (RsLapRLS), which analyzes the different risk degrees of unlabeled data. Our motivation is that unlabeled data may be risky in SSL and that the risk degrees differ across samples. We assign different risk degrees to unlabeled samples according to their different characteristics in supervised and semi-supervised learning, and then integrate a risk-based tradeoff term between supervised and semi-supervised learning into the SSL objective function. The risk degrees determine how the unlabeled data are exploited: samples with large risk degrees should be exploited by SL and the others by SSL. Specifically, we employ Regularized Least Squares (RLS) for SL and Laplacian RLS (LapRLS) for SSL. Experimental results on several UCI and benchmark datasets show that the performance of our algorithm is never significantly inferior to that of RLS and LapRLS. In this way, our algorithm improves the practicability of SSL.
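To make the idea of a risk-based tradeoff concrete, the following is a minimal sketch in Python with NumPy, not the objective function of the paper, which the abstract does not state. It assumes a linear model, a fully connected Gaussian-weighted graph, and a hypothetical tradeoff term that pulls the semi-supervised prediction on each unlabeled sample toward the RLS prediction in proportion to that sample's risk degree. The names `fit_risk_weighted`, `disagreement_risk`, `lam`, `gamma_a`, `gamma_i` and `sigma`, as well as the disagreement-based risk heuristic, are illustrative assumptions.

```python
import numpy as np


def graph_laplacian(X, sigma=1.0):
    """Unnormalized Laplacian L = D - W of a fully connected Gaussian graph."""
    sq_dist = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    W = np.exp(-sq_dist / (2.0 * sigma ** 2))
    np.fill_diagonal(W, 0.0)  # no self-loops
    return np.diag(W.sum(axis=1)) - W


def fit_rls(X_l, y, gamma_a=1e-2):
    """Linear RLS: minimize ||X_l w - y||^2 + gamma_a ||w||^2."""
    d = X_l.shape[1]
    return np.linalg.solve(X_l.T @ X_l + gamma_a * np.eye(d), X_l.T @ y)


def fit_laprls(X_l, y, X_u, gamma_a=1e-2, gamma_i=1e-2, sigma=1.0):
    """Linear LapRLS: the RLS objective plus gamma_i * w^T X^T L X w."""
    X = np.vstack([X_l, X_u])
    L = graph_laplacian(X, sigma)
    d = X.shape[1]
    A = X_l.T @ X_l + gamma_a * np.eye(d) + gamma_i * (X.T @ L @ X)
    return np.linalg.solve(A, X_l.T @ y)


def fit_risk_weighted(X_l, y, X_u, risk, lam=1.0,
                      gamma_a=1e-2, gamma_i=1e-2, sigma=1.0):
    """Hypothetical risk-weighted variant (an assumption, not the paper's form).

    Adds lam * sum_j risk[j] * (x_j^T w - x_j^T w_rls)^2 over unlabeled
    samples to the LapRLS objective, so high-risk samples follow RLS.
    """
    w_rls = fit_rls(X_l, y, gamma_a)
    g = X_u @ w_rls                      # supervised predictions on X_u
    X = np.vstack([X_l, X_u])
    L = graph_laplacian(X, sigma)
    d = X.shape[1]
    RX_u = risk[:, None] * X_u           # same as diag(risk) @ X_u
    A = (X_l.T @ X_l + gamma_a * np.eye(d)
         + gamma_i * (X.T @ L @ X) + lam * (X_u.T @ RX_u))
    b = X_l.T @ y + lam * (RX_u.T @ g)
    return np.linalg.solve(A, b)


def disagreement_risk(X_u, w_rls, w_lap):
    """Hypothetical risk heuristic: scaled |LapRLS - RLS| prediction gap."""
    gap = np.abs(X_u @ w_lap - X_u @ w_rls)
    top = gap.max()
    return gap / top if top > 0 else np.zeros_like(gap)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X_l = rng.normal(size=(10, 3))
    y = np.sign(X_l[:, 0])
    X_u = rng.normal(size=(30, 3))
    w_rls = fit_rls(X_l, y)
    w_lap = fit_laprls(X_l, y, X_u)
    risk = disagreement_risk(X_u, w_rls, w_lap)
    print(fit_risk_weighted(X_l, y, X_u, risk))
```

Under these assumptions, setting every risk degree to zero recovers the LapRLS solution, while a very large `lam` with uniform risk drives the solution toward RLS, which mirrors the stated intent that risky samples are handled by SL and the rest by SSL.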
