Abstract

Safe semi-supervised clustering (S3C) has recently become an emerging topic in the machine learning field. S3C aims to reduce the likelihood that wrong prior knowledge degrades clustering performance. Existing S3C methods estimate the risk or safety degrees through a predefined formula, so their performance depends heavily on the quality of that formula. To reduce this dependence, it is worthwhile to develop a novel S3C method that estimates the safety degrees adaptively. To this end, we propose an adaptive safety-aware semi-supervised fuzzy c-means algorithm (AS3FCM) designed to deal with mislabeled instances. To estimate the safety degrees adaptively, AS3FCM employs a local consistency strategy to build a regularization term. Meanwhile, two regularization terms are constructed to constrain the outputs of the labeled instances. The constructed regularization terms are then embedded into the objective function of FCM. Finally, an alternating iterative optimization strategy is used to solve the resulting optimization problem. In each iteration, the safety degrees are adaptively computed and updated. To evaluate the performance of our algorithm, experiments are conducted on a toy dataset and several benchmark datasets. As the percentage of mislabeled instances varies, AS3FCM achieves clustering accuracy ranging from 81.23% to 82.87% on Heart and from 86.84% to 88.12% on WDBC, while the other S3C methods yield accuracy ranging from 80.09% to 82.41% on Heart and from 86.34% to 87.94% on WDBC. These results verify that AS3FCM can reasonably estimate the safety degrees and achieve a safe exploration of the mislabeled instances.
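
The abstract describes embedding regularization terms into the FCM objective and solving it by alternating iterative optimization. As a point of reference only, the following is a minimal sketch of plain, unsupervised fuzzy c-means with that alternating structure. It is not AS3FCM: the safety-degree and labeled-instance regularization terms are not specified in this abstract and are omitted here. The function name `fuzzy_c_means` and its parameters (`m`, `n_iter`, `tol`, `seed`) are illustrative assumptions.

```python
import numpy as np

def fuzzy_c_means(X, n_clusters, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Plain fuzzy c-means (no supervision, no safety terms).

    X: (n_samples, n_features) array; m > 1 is the fuzzifier.
    Returns memberships U (n_samples, n_clusters) and centers V.
    """
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    # Random initial memberships; each row is normalized to sum to 1.
    U = rng.random((n, n_clusters))
    U /= U.sum(axis=1, keepdims=True)
    for _ in range(n_iter):
        # Step 1: with memberships fixed, update the cluster centers
        # as membership-weighted means of the data.
        W = U ** m
        V = (W.T @ X) / W.sum(axis=0)[:, None]
        # Step 2: with centers fixed, update the memberships from
        # squared Euclidean distances to each center.
        d2 = ((X[:, None, :] - V[None, :, :]) ** 2).sum(axis=2)
        d2 = np.maximum(d2, 1e-12)  # avoid division by zero
        inv = d2 ** (-1.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)
        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break
    return U, V
```

In a regularized variant such as the one the abstract describes, each alternating step would presumably also account for the added terms, and the safety degrees would be updated within the same loop; the exact update rules would have to come from the full paper.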
