Abstract

Self-training is one of the successful approaches to semi-supervised classification: it trains a classifier by exploiting both labeled and unlabeled data. However, most self-training methods are limited by the distribution of the initial labeled data, rely heavily on parameters, and predict poorly during the self-training process. To solve these problems, a novel self-training method based on density peaks and natural neighbors (STDPNaN) is proposed. In STDPNaN, an improved parameter-free density peaks clustering algorithm (DPCNaN) is first presented by introducing natural neighbors. DPCNaN can reveal the real structure and distribution of the data without any parameters, which helps STDPNaN restore the real data space whether its distribution is spherical or non-spherical. In addition, an ensemble classifier is employed to improve the predictive ability of STDPNaN during the self-training process. Extensive experiments show that (a) STDPNaN outperforms state-of-the-art methods in improving the classification accuracy of k-nearest neighbor, support vector machine, and classification and regression tree classifiers; (b) STDPNaN also outperforms the comparison methods without any restriction on the number of labeled data; (c) the running time of STDPNaN is acceptable.
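
For readers unfamiliar with self-training, the generic loop that methods of this family build on can be sketched as follows. This is not STDPNaN: it uses no density peaks, no natural neighbors, and no ensemble. A plain 1-nearest-neighbor rule stands in for the base classifier, and distance to the nearest labeled point is an assumed proxy for prediction confidence. The names `nearest_label`, `self_train`, and `batch` are hypothetical and chosen only for illustration.

```python
import math


def nearest_label(x, labeled):
    """Return (distance, label) of the labeled point closest to x (1-NN)."""
    return min((math.dist(x, p), y) for p, y in labeled)


def self_train(labeled, unlabeled, batch=1):
    """Generic self-training sketch with a 1-NN base classifier.

    labeled:   list of (point, label) pairs
    unlabeled: list of points
    Each round pseudo-labels the `batch` most confident unlabeled points
    (assumption: smaller distance means higher confidence) and moves them
    into the labeled set, until no unlabeled points remain.
    """
    labeled = list(labeled)
    pool = list(unlabeled)
    while pool:
        # Predict every remaining point with the current labeled set.
        preds = [(nearest_label(x, labeled), x) for x in pool]
        # Most confident predictions first.
        preds.sort(key=lambda item: item[0][0])
        for (_, y), x in preds[:batch]:
            labeled.append((x, y))
            pool.remove(x)
    return labeled


if __name__ == "__main__":
    seeds = [((0.0, 0.0), "a"), ((10.0, 10.0), "b")]
    points = [(1.0, 0.0), (2.0, 0.0), (9.0, 10.0), (8.0, 10.0)]
    for point, label in self_train(seeds, points):
        print(point, label)
```

Because each round relies only on the points labeled so far, the outcome depends strongly on where the initial labeled points lie, which is the limitation the abstract attributes to most self-training methods.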
