Abstract

Self-training is one of the more successful approaches to semi-supervised classification. It exploits both labeled and unlabeled data to train a satisfactory supervised classifier. Mislabeling is one of the largest challenges in self-training, and the most common technique for removing mislabeled samples is the local noise filter. However, existing local noise filters used in self-training methods suffer from two technical defects: they depend on parameters, and they use only labeled data to remove mislabeled samples. To address these shortcomings, this paper proposes a novel self-training method based on density peaks and an extended parameter-free local noise filter (STDPNF). In STDPNF, the self-training method based on density peaks is redesigned so that it combines more readily with local noise filters. Moreover, a new local noise filter based on natural neighbors is proposed to filter out mislabeled instances. Compared with existing local noise filters used in self-training methods, the one in STDPNF is parameter-free and removes mislabeled samples by exploiting information from both labeled and unlabeled data. We focus on the k-nearest-neighbor classifier as the base classifier. Experiments verify that STDPNF improves the performance of the k-nearest-neighbor base classifier and that it removes mislabeled instances efficiently even when labeled data are insufficient.
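To make the setting concrete, the following is a minimal sketch of generic self-training with a k-nearest-neighbor base classifier and a simple neighbor-agreement filter. It is not STDPNF: the selection rule (nearest unlabeled points first), the fixed `k`, the `batch` size, and the function names `knn_predict`, `neighbour_filter`, and `self_train` are all assumptions made for illustration. The filter shown here depends on `k`, which is the kind of parameter dependence the abstract identifies as a defect; the density-peaks ordering and the natural-neighbor filter of STDPNF are not reproduced.

```python
import math
from collections import Counter


def knn_predict(train_X, train_y, x, k=3):
    """Majority vote among the k nearest labeled points (Euclidean distance)."""
    order = sorted(range(len(train_X)), key=lambda i: math.dist(train_X[i], x))
    votes = Counter(train_y[i] for i in order[:k])
    return votes.most_common(1)[0][0]


def neighbour_filter(X, y, n_trusted, k=3):
    """Drop pseudo-labeled points whose label disagrees with the majority
    label of their k nearest neighbours. The first n_trusted points are the
    original labeled data and are never removed.
    Assumption: a fixed-k filter stands in for the paper's parameter-free one."""
    keep = list(range(n_trusted))
    for i in range(n_trusted, len(X)):
        others = [j for j in range(len(X)) if j != i]
        others.sort(key=lambda j: math.dist(X[j], X[i]))
        majority = Counter(y[j] for j in others[:k]).most_common(1)[0][0]
        if majority == y[i]:
            keep.append(i)
    return [X[i] for i in keep], [y[i] for i in keep]


def self_train(labeled_X, labeled_y, unlabeled_X, k=3, batch=2):
    """Generic self-training loop: pseudo-label a few unlabeled points per
    round, add them to the training set, then filter suspicious labels."""
    X, y = list(labeled_X), list(labeled_y)
    n_trusted = len(X)
    pool = list(unlabeled_X)
    while pool:
        # Assumption: take the unlabeled points closest to the current
        # training set first (the paper orders points via density peaks).
        pool.sort(key=lambda u: min(math.dist(u, p) for p in X))
        chosen, pool = pool[:batch], pool[batch:]
        # Predict all labels for this batch before adding any of them.
        pseudo = [(u, knn_predict(X, y, u, k)) for u in chosen]
        for u, label in pseudo:
            X.append(u)
            y.append(label)
        # Points rejected by the filter are discarded, not returned to the pool.
        X, y = neighbour_filter(X, y, n_trusted, k)
    return X, y
```

Calling `self_train` with a handful of labeled points per class and a pool of unlabeled points returns the enlarged training set and its labels, which can then be passed to `knn_predict` for classification.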
