Abstract

Training neural network classifiers (NNCs) usually requires all instances to be correctly labeled, a condition that is difficult or expensive to satisfy in many practical applications. When label noise is present, mislabeled data can severely mislead the training of NNCs, resulting in poor generalization performance. In this work, we address the label noise issue by removing mislabeled instances from the training data. We propose a COnsistence-based Mislabeled Instances REmoval (COMIRE) method. The main idea rests on the observation that, during the training of an NNC, the training loss and the model's prediction uncertainty follow similar trends for correctly labeled instances but quite different trends for mislabeled instances. The consistency between the two trends can therefore be used to distinguish correctly labeled instances from mislabeled ones. On this basis, an iteration scheme is introduced to further increase the separability between the two types of data. Experimental results show that COMIRE can effectively identify mislabeled instances, and that classification performance improves significantly after the identified instances are removed from the noisy training data.
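The consistency idea can be illustrated with a minimal sketch. This is not the COMIRE procedure itself: the abstract does not specify the consistency measure, so the Pearson correlation used here, the function names `trend_consistency` and `flag_suspected_mislabeled`, and the `threshold` parameter are all assumptions made for illustration. The sketch takes per-instance loss and uncertainty values recorded at each epoch and flags instances whose two curves do not move together.

```python
import numpy as np


def trend_consistency(losses, uncertainties):
    """Pearson correlation between each instance's loss and uncertainty curves.

    Both inputs have shape (n_instances, n_epochs). Correlation is an assumed
    stand-in for the consistency measure, which the abstract does not define.
    """
    losses = np.asarray(losses, dtype=float)
    uncertainties = np.asarray(uncertainties, dtype=float)
    # Center each instance's curve over the epoch axis.
    lc = losses - losses.mean(axis=1, keepdims=True)
    uc = uncertainties - uncertainties.mean(axis=1, keepdims=True)
    denom = np.linalg.norm(lc, axis=1) * np.linalg.norm(uc, axis=1)
    # A flat curve has zero variance; report zero consistency in that case.
    safe = np.where(denom > 0, denom, 1.0)
    return np.where(denom > 0, (lc * uc).sum(axis=1) / safe, 0.0)


def flag_suspected_mislabeled(losses, uncertainties, threshold=0.0):
    """Boolean mask of instances whose consistency is below `threshold`.

    The threshold value is a hypothetical choice, not taken from the paper.
    """
    return trend_consistency(losses, uncertainties) < threshold


# Toy curves (made up for illustration, not experimental data):
# row 0: loss and uncertainty both fall; row 1: loss rises while uncertainty falls.
toy_losses = np.array([[2.0, 1.5, 1.0, 0.5],
                       [1.0, 1.5, 2.0, 2.5]])
toy_uncertainties = np.array([[1.0, 0.8, 0.5, 0.2],
                              [1.0, 0.8, 0.5, 0.2]])
print(flag_suspected_mislabeled(toy_losses, toy_uncertainties))
```

In this sketch, the iteration scheme mentioned in the abstract would correspond to removing the flagged instances, retraining, and recomputing the curves, although the abstract gives no details on how that loop is actually organized.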
