Abstract

Deep learning demands a large amount of annotated data, and the annotation task is often crowdsourced for economic efficiency. When annotation is delegated to non-experts, the dataset may contain inaccurately labeled data. Noisy labels not only yield classification models with sub-optimal performance, but may also impede their optimization dynamics. In this work, we propose exploiting the pattern recognition capacity of deep convolutional neural networks to filter out supposedly mislabeled cases during training. We suggest a training method that references softmax outputs to judge the correctness of the given labels. This approach achieved outstanding performance compared to existing methods in various noise settings on a large-scale dataset (Kaggle 2015 Diabetic Retinopathy). Furthermore, we demonstrate a method for mining positive cases from a pool of unlabeled images by exploiting the model's generalization ability. With this method, we won first place on the offsite validation dataset of the pathological myopia classification challenge (PALM), achieving an AUROC of 0.9993 in the final submission. Source code is publicly available.
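
The abstract describes judging each given label by the softmax probability the network currently assigns to it. Below is a minimal PyTorch sketch of one such filtering rule; the function name, the fixed threshold of 0.1, and the per-batch masking are illustrative assumptions, not the paper's exact criterion.

```python
import torch
import torch.nn.functional as F

def filtered_cross_entropy(logits, labels, threshold=0.1):
    """Drop samples whose softmax probability on their *given* label is
    below `threshold`, treating them as likely mislabeled, and compute
    cross-entropy only on the remaining samples."""
    probs = F.softmax(logits.detach(), dim=1)                  # (B, C); no grad through the filter
    p_given = probs.gather(1, labels.unsqueeze(1)).squeeze(1)  # probability assigned to the given label
    keep = p_given >= threshold                                # boolean mask of trusted samples
    if not keep.any():                                         # every sample filtered out this batch
        return logits.sum() * 0.0                              # contribute zero loss, keep graph intact
    return F.cross_entropy(logits[keep], labels[keep])
```

The probabilities are detached so gradients do not flow through the gating decision; in practice such filtering is typically enabled only after a few warm-up epochs, since early softmax outputs are uninformative.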

Highlights

  • We present a method for filtering noisy labels based on the classifier’s confidence and an active learning approach based on pseudo-labels to use publicly available unlabeled data (a sketch of the mining step follows this list)

  • Under various noise settings, a model trained with our filtration method in the presence of label noise performed comparably to one trained without injected synthetic noise
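
Mining positive cases from an unlabeled pool, as referenced in the second highlight, amounts to pseudo-labeling: keep only images that a trained model classifies as positive with very high confidence. A hedged sketch follows, assuming a binary classifier whose class index 1 is the positive class, a loader that yields bare image tensors, and an illustrative confidence cutoff of 0.99.

```python
import torch
import torch.nn.functional as F

@torch.no_grad()
def mine_positives(model, unlabeled_loader, device, positive_class=1, min_conf=0.99):
    """Scan an unlabeled pool with a trained classifier and keep images
    predicted as the positive class with very high confidence; the kept
    (image, pseudo_label) pairs can then be added to the training set."""
    model.eval()
    mined = []
    for images in unlabeled_loader:                       # assumed to yield bare image tensors
        probs = F.softmax(model(images.to(device)), dim=1)
        conf, pred = probs.max(dim=1)                     # top-1 confidence and predicted class
        mask = (pred == positive_class) & (conf >= min_conf)
        for img in images[mask.cpu()]:                    # keep only high-confidence positives
            mined.append((img, positive_class))
    return mined
```

The mined pairs can be appended to the labeled training set before retraining; the cutoff trades pseudo-label purity against the number of cases recovered.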

Introduction

Deep learning (DL) has been applied to solving various tasks in manufacturing [1,2], surveillance [3,4], and healthcare [5,6,7]. Training such models demands large amounts of annotated data, which is costly to obtain. In an attempt to reduce this cost, annotation is often crowdsourced or delegated to non-experts with limited supervision, which invites human error and systematic bias [10]. Both issues are prevalent in medical data [11,12], where even experts may hold conflicting opinions on identical images [13]. Incorrect annotations impede training, forcing the network's predictions to coincide with the wrong labels and causing it to forget meaningful patterns. Such issues can be remedied by regularization techniques designed to prevent overfitting or by analyzing the structure underlying systematic noise [15,16,17,18]. Curriculum learning [19] has also been shown to help in handling noisy labels by ignoring suspicious labels during training [20,21,22,23].
