Abstract

Knowledge distillation (KD), which aims to transfer knowledge from a complex network (a teacher) to a simpler and smaller network (a student), has received considerable attention in recent years. Most existing KD methods assume well-labeled data. Unfortunately, real-world data inevitably involve noisy labels, which degrade the performance of these methods. In this article, we study a little-explored but important issue: KD with noisy labels. To this end, we propose a novel KD method, called ambiguity-guided mutual label refinery KD (AML-KD), to train the student model in the presence of noisy labels. Specifically, based on the pretrained teacher model, a two-stage label refinery framework is introduced to refine labels gradually. In the first stage, we perform label propagation (LP) with small-loss sample selection guided by the teacher model, improving the learning capability of the student model. In the second stage, we perform mutual LP between the teacher and student models so that each benefits the other. During label refinery, an ambiguity-aware weight estimation (AWE) module is developed to down-weight ambiguous samples and avoid overfitting them. A distinct advantage of AML-KD is that it learns a high-accuracy, low-cost student model despite label noise. Experimental results on synthetic and real-world noisy datasets show the effectiveness of AML-KD against state-of-the-art KD methods and label noise learning (LNL) methods. Code is available at https://github.com/Runqing-forMost/AML-KD.
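The small-loss selection mentioned above is a standard heuristic in label noise learning: samples whose loss under a trusted model is small are more likely to carry clean labels. A minimal sketch of teacher-guided small-loss selection is shown below; the function name, keep ratio, and toy data are illustrative assumptions, not details taken from the paper:

```python
import numpy as np

def small_loss_select(teacher_probs, noisy_labels, keep_ratio=0.7):
    """Select likely-clean samples: those with the smallest teacher
    cross-entropy loss against the (possibly noisy) given labels.
    All names and defaults here are illustrative, not from the paper."""
    n = len(noisy_labels)
    # Per-sample cross-entropy of teacher predictions vs. given labels.
    losses = -np.log(teacher_probs[np.arange(n), noisy_labels] + 1e-12)
    k = max(1, int(keep_ratio * n))
    # Keep the k samples with the smallest loss as the "clean" subset.
    return np.argsort(losses)[:k]

# Toy example: 4 samples, 3 classes; the last label disagrees
# with a confident teacher and should be filtered out.
probs = np.array([[0.80, 0.10, 0.10],
                  [0.10, 0.80, 0.10],
                  [0.20, 0.20, 0.60],
                  [0.90, 0.05, 0.05]])
labels = np.array([0, 1, 2, 1])
keep = small_loss_select(probs, labels, keep_ratio=0.75)
```

In a full pipeline such a selected subset would then seed label propagation, with the remaining labels refined rather than trusted as-is.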
