Abstract

Due to DNNs’ memorization effect, label noise degrades the performance of web-supervised fine-grained visual categorization. Previous literature primarily relies on small-loss instances for subsequent training. The current state-of-the-art approach, JoCoR, additionally employs explicit consistency constraints to make clean samples more confident. However, a joint loss designed for both sample selection and parameter updating is not competent for training a robust model in the presence of web noise. In particular, false positives are assigned larger weights, causing the model to pay more attention to misclassified noisy images. Moreover, relying on weight decay to forget discarded noisy instances is too slow and implicit to be effective. Therefore, we propose a simple yet effective approach named MS-DeJOR (Multi-Scale training with Decoupled Joint Optimization and Refurbishment). In contrast to JoCoR, we decouple sample selection from the training procedure to address these problems. Specifically, a negative entropy term is applied to prevent false positives from being overemphasized; by imposing this regularization term on all training data, the model can explicitly forget samples identified as noise. Furthermore, we use accumulated predictions to refurbish noisy labels and re-weight training images to boost model performance. A multi-scale feature enhancement module is adopted to extract discriminative and subtle feature representations. Extensive experiments show that MS-DeJOR yields state-of-the-art performance on three web-supervised fine-grained datasets, demonstrating the effectiveness of our approach. The data and source code are available at https://github.com/msdejor/MS-DeJOR.
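The negative-entropy regularization and accumulated-prediction refurbishment mentioned above can be illustrated with a minimal NumPy sketch. This is our own hedged reading of the abstract, not the paper's released code: the function names, the momentum parameter, and the EMA form of the accumulation are all illustrative assumptions.

```python
import numpy as np

def softmax(logits):
    # Numerically stable softmax over the class axis.
    z = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def negative_entropy(probs, eps=1e-12):
    # Negative entropy of the predicted distribution, sum_k p_k * log p_k.
    # Adding this term to the loss on ALL training samples penalizes
    # over-confident predictions, so false positives (misclassified noisy
    # images) are not overemphasized during training.
    return np.sum(probs * np.log(probs + eps), axis=-1)

def refurbish_labels(accumulated_probs, new_probs, momentum=0.9):
    # Running average of per-sample predictions across epochs; the
    # accumulated distribution serves as a soft "refurbished" label for
    # samples flagged as noisy. (The momentum value is an assumption.)
    return momentum * accumulated_probs + (1.0 - momentum) * new_probs
```

Under this sketch, a uniform prediction over K classes attains the minimum negative entropy of -log K, while a peaked (over-confident) prediction is penalized with a larger value.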
