Abstract

Label noise in training data can significantly degrade a model's generalization performance in supervised learning tasks. Here we focus on the setting where noisy labels arise primarily from mislabeled confusing samples: samples whose features are equivocal and which tend to concentrate near decision boundaries rather than being uniformly distributed. To address this problem, we propose an ensemble learning method that corrects noisy labels by exploiting the local structures of feature manifolds. Unlike typical ensemble strategies, which increase prediction diversity among sub-models via additional loss terms, our method trains sub-models on disjoint subsets, each a union of randomly selected seed samples' same-class nearest neighbors on the data manifold. As a result, only a limited number of sub-models are affected by locally concentrated noisy labels, and each sub-model learns a coarse representation of the data manifold along with a corresponding graph. The constructed graphs suggest a set of label-correction candidates, from which our method determines the final corrections by majority vote. Experiments on real-world noisy-label datasets demonstrate the superiority of the proposed method over existing state-of-the-art approaches.
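The core pipeline described in the abstract can be sketched in a simplified form. The toy example below is an assumption-laden illustration, not the paper's implementation: it builds disjoint training subsets by growing same-class nearest-neighbor sets around random seeds, trains one very simple sub-model per subset (a nearest-centroid classifier standing in for the paper's manifold-plus-graph sub-models), and replaces each label with the sub-models' majority decision. All names, the number of sub-models `M`, and the neighborhood size `k` are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy 2-class data: two Gaussian blobs, with labels flipped on the
# most "confusing" points, i.e. those nearest the decision boundary.
n_per = 100
X = np.vstack([rng.normal(-2, 1, (n_per, 2)), rng.normal(2, 1, (n_per, 2))])
y_true = np.repeat([0, 1], n_per)
y_noisy = y_true.copy()
flip = np.argsort(np.abs(X[:, 0]))[:12]   # boundary-concentrated noise
y_noisy[flip] = 1 - y_noisy[flip]

M, k = 5, 18  # number of sub-models, neighbors per seed (illustrative)

def build_disjoint_subsets(X, labels, M, k, rng):
    """For each sub-model, pick one unused seed per class and take its k
    unused same-class nearest neighbors; subsets are kept disjoint."""
    unused = np.ones(len(X), dtype=bool)
    subsets = []
    for _ in range(M):
        idx = []
        for c in np.unique(labels):
            cand = np.flatnonzero(unused & (labels == c))
            if len(cand) == 0:
                continue
            seed = rng.choice(cand)
            d = np.linalg.norm(X[cand] - X[seed], axis=1)
            take = cand[np.argsort(d)[:k]]
            idx.extend(take.tolist())
            unused[take] = False
        subsets.append(np.array(idx))
    return subsets

def centroid_predict(X_tr, y_tr, X_q):
    """Minimal stand-in sub-model: nearest class centroid."""
    classes = np.unique(y_tr)
    cents = np.stack([X_tr[y_tr == c].mean(axis=0) for c in classes])
    d = np.linalg.norm(X_q[:, None] - cents[None], axis=2)
    return classes[d.argmin(axis=1)]

subsets = build_disjoint_subsets(X, y_noisy, M, k, rng)
votes = np.stack([centroid_predict(X[s], y_noisy[s], X) for s in subsets])
# Majority decision across sub-models becomes the corrected label.
y_corrected = (votes.sum(axis=0) > M / 2).astype(int)

print("label errors before correction:", (y_noisy != y_true).sum())
print("label errors after correction: ", (y_corrected != y_true).sum())
```

Because the noisy labels are concentrated near the boundary, each disjoint subset absorbs only a few of them, so most sub-models still vote for the clean label; this is the intuition the abstract attributes to the disjoint-subset construction.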
