Abstract

A precise large-scale dataset is crucial for supervising the training of deep neural networks (DNNs) for image classification. However, manually annotating a large-scale dataset is time-consuming, which limits the scalability of supervised training. On the other hand, it is relatively easy to obtain a small clean dataset together with a vast amount of data with noisy labels, but training on a noisy dataset causes the performance of deep networks to drop dramatically. To overcome this problem, this work studies how to effectively and efficiently train deep networks on a noisy large-scale dataset in conjunction with a small clean dataset. One problem with transfer learning from a small clean dataset is that the model risks over-fitting to the clean dataset, since it has far more parameters than there are training examples. Hence, we propose a new approach, called online easy example mining (OEEM), to train the deep network on the entire noisy dataset. OEEM selects clean samples to guide training without human annotation by estimating the confidence of the observed labels from the model's predictions. However, the sample-selection bias in OEEM can trap the model in a poor local optimum. Consequently, we propose a general framework called Mutual Calibration Training (MCT) that is robust to different noise levels and noise types; it uses dual models and combines the ideas of transfer learning and OEEM. Finally, we conduct experiments on synthetic and real-world datasets with different noise types and noise rates. The results demonstrate the effectiveness of our approach.

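As a rough illustration of the sample-selection idea described in the abstract, the following is a minimal PyTorch-style sketch of confidence-based easy-example mining: the model's predicted probability for each observed label is used to pick the most trusted fraction of a mini-batch, and the loss is computed only on that subset. The function name `select_easy_examples` and the `keep_ratio` parameter are illustrative assumptions, not the authors' exact formulation.

```python
import torch
import torch.nn.functional as F

def select_easy_examples(logits, observed_labels, keep_ratio=0.7):
    """Sketch of confidence-based clean-sample selection (OEEM-style idea).

    logits:          (B, C) raw model outputs for a mini-batch
    observed_labels: (B,)   possibly noisy integer labels
    keep_ratio:      fraction of the batch treated as "easy"/clean (assumed hyper-parameter)
    """
    # Confidence of each observed label under the current model.
    probs = F.softmax(logits, dim=1)
    label_confidence = probs.gather(1, observed_labels.unsqueeze(1)).squeeze(1)

    # Keep the samples whose observed label receives the highest predicted probability.
    num_keep = max(1, int(keep_ratio * logits.size(0)))
    _, keep_idx = torch.topk(label_confidence, num_keep)

    # The loss is computed only on the selected (presumably clean) samples.
    loss = F.cross_entropy(logits[keep_idx], observed_labels[keep_idx])
    return loss, keep_idx
```

In a dual-model setup such as the MCT framework outlined above, one could let each network select easy examples for the other, so that the two models calibrate each other and the sample-selection bias of a single model is reduced; this usage note is an interpretation of the abstract, not a description of the published algorithm.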