Abstract

The resistance of computational models to label noise offers promising potential for correcting erroneous labels. One intuitive approach is to re-label data samples according to the model's predictions when the original label error rate is relatively high. However, directly flipping labels to the model's predictions may not improve dataset quality, because label flipping concentrates noisy labels in small regions of the input space, i.e., it increases noise condensity. Given the same label accuracy, datasets with condensed noise lead to (much) worse learning models than those without condensed noise; hence, dataset quality may not benefit from this correction process. Moreover, iteratively flipping labels typically causes label accuracy to decrease rather than settle into a stable error rate around which it slowly oscillates at each subsequent iteration. In this paper, we propose a novel method that simultaneously reduces the label error rate and improves dataset quality (by reducing noise condensity). In contrast to existing methods that either involve humans in the label correction process or construct multiple models to obtain a consensus opinion, the proposed method is simple and improves dataset quality automatically. Specifically, we use a small clean dataset to evaluate the overfitting caused by the concentration of label noise. Once the noise condensity issue is detected, the label flipping process is modified by introducing a statistical probability into the flipping procedure. The proposed method is verified on noisy MNIST and CIFAR-10 datasets. Label correction results are presented, and the prediction accuracies of neural network models trained on the corrected datasets are compared with results from methods that target learning from noisy labels.
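To make the core idea concrete, the following is a minimal sketch, not the paper's implementation: it flips each label to the model's predicted class with a probability tied to the model's confidence, rather than deterministically, and uses a small clean dataset to flag possible overfitting to condensed noise. The function names, the confidence-based flip probability, and the threshold `tol` are illustrative assumptions, not details taken from the paper.

```python
import numpy as np

def probabilistic_label_flip(labels, probs, rng=None):
    """Flip each label to the model's predicted class with probability
    equal to the model's confidence in that class (illustrative rule;
    the paper's exact flipping probability may differ).

    labels: (n,) current, possibly noisy, integer labels
    probs:  (n, k) model-predicted class probabilities
    """
    if rng is None:
        rng = np.random.default_rng(0)
    preds = probs.argmax(axis=1)           # model's predicted class per sample
    conf = probs.max(axis=1)               # confidence in that prediction
    flip = rng.random(len(labels)) < conf  # stochastic flip decision
    return np.where(flip, preds, labels)

def condensation_suspected(clean_accuracy, estimated_label_accuracy, tol=0.05):
    """Hypothetical heuristic: if accuracy on a small clean set falls
    noticeably below the estimated label accuracy of the training set,
    the model may be overfitting to condensed label noise."""
    return clean_accuracy < estimated_label_accuracy - tol

# Example usage: switch from deterministic to probabilistic flipping
# once the condensation check fires.
labels = np.array([0, 1, 1, 2])
probs = np.array([[0.9, 0.05, 0.05],
                  [0.2, 0.7, 0.1],
                  [0.1, 0.3, 0.6],
                  [0.05, 0.05, 0.9]])
if condensation_suspected(clean_accuracy=0.80, estimated_label_accuracy=0.90):
    labels = probabilistic_label_flip(labels, probs)
else:
    labels = probs.argmax(axis=1)          # plain deterministic re-labeling
```

The stochastic decision is what keeps corrections from all piling onto the regions where the model is confidently wrong, which is the condensation effect the abstract describes.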
