Abstract
In the past few years, Safe Semi-Supervised Learning (S3L) has become an emerging research topic. A few studies have investigated S3L and achieved the desired performance. However, these studies mainly focus on classification problems, so clustering has received less attention. Moreover, no study takes both risky labeled and risky unlabeled samples (e.g., mislabeled samples and outliers) into consideration. Therefore, we propose a novel safe semi-supervised clustering method to safely exploit the labeled and unlabeled samples. First, we compute a Safe Degree (SD) by estimating the local density and minimum distance of each labeled and unlabeled sample. If a sample has a large local density and a small minimum distance, it is considered safe and its SD is correspondingly high; otherwise, the sample is considered risky and its SD is low. The SD is then introduced into a model-based semi-supervised clustering method to reduce the negative influence of risky labeled and unlabeled samples. Additionally, we construct a graph-based regularization term that constrains the outputs of risky labeled samples toward those of their nearest unlabeled neighbors, which is expected to further reduce the harm of risky labeled samples. An illustration on an artificial dataset is given to explain the usefulness of the defined SD. Finally, results on ten UCI datasets show that our algorithm achieves good clustering performance.
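The abstract does not give the exact SD formula, so the following is only a minimal sketch of the idea that high local density and small minimum distance should yield a high Safe Degree. Everything specific here is an assumption made for illustration and is not taken from the paper: the function name `safe_degree`, the Gaussian kernel with a `cutoff` parameter for local density, the reading of "minimum distance" as the distance to the nearest other sample, and the product used to combine the two quantities.

```python
import math


def safe_degree(points, cutoff=1.0):
    """Illustrative Safe Degree in [0, 1] for each sample (hypothetical formula).

    points: list of equal-length numeric tuples.
    cutoff: assumed kernel width for the local density estimate.
    """
    n = len(points)
    if n < 2:
        raise ValueError("need at least two samples")

    # Pairwise Euclidean distances.
    dist = [[math.dist(p, q) for q in points] for p in points]

    # Local density: Gaussian-kernel sum over all other samples (assumption).
    rho = [
        sum(math.exp(-((dist[i][j] / cutoff) ** 2)) for j in range(n) if j != i)
        for i in range(n)
    ]

    # Minimum distance: distance to the nearest other sample (assumption).
    delta = [min(dist[i][j] for j in range(n) if j != i) for i in range(n)]

    # Scale both quantities by their maxima; guard against division by zero.
    rho_max = max(rho) or 1.0
    delta_max = max(delta) or 1.0

    # High density and small minimum distance give an SD close to 1.
    return [(r / rho_max) * (1.0 - d / delta_max) for r, d in zip(rho, delta)]


if __name__ == "__main__":
    # Four tightly grouped samples and one far-away outlier.
    samples = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1), (5.0, 5.0)]
    for point, sd in zip(samples, safe_degree(samples)):
        print(point, round(sd, 3))
```

In this toy setting the isolated sample receives the lowest score, which is the behavior the abstract describes for risky samples such as outliers. How the SD then weights the clustering objective and the graph regularizer is specified in the paper itself and is not reproduced here.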