Learning by aggregating experts and filtering novices: a solution to crowdsourcing problems in bioinformatics

Ping Zhang, Weidan Cao, Zoran Obradovic

doi:10.1186/1471-2105-14-s12-s5

Abstract

Background: In many biomedical applications, there is a need for developing classification models based on noisy annotations. Recently, various methods have addressed this scenario by relying on unreliable annotations obtained from multiple sources.

Results: We proposed a probabilistic classification algorithm based on labels obtained from multiple noisy annotators. The new algorithm is capable of eliminating annotations provided by novice labellers and of providing a more accurate estimate of the ground truth by consensus labelling according to higher quality annotations. The approach is evaluated on text classification and prediction of protein disorder. Our study suggests that higher levels of accuracy, effectiveness and performance can be achieved by the new method as compared to alternatives.

Conclusions: The proposed method is applicable for meta-learning from multiple existing classification models and noisy annotations obtained by humans. It is particularly beneficial when many annotations are obtained by novice labellers. In addition, the proposed method can provide further characterization of each annotator, which can help in developing more accurate classifiers by identifying the most competent annotators for each data instance.

Highlights

In many biomedical applications, there is a need for developing classification models based on noisy annotations
In computer-aided diagnosis (CAD), many computer-aided image diagnosis systems [5,21,22,23,24] were built from labels assigned by multiple physicians who provide their estimations of the gold standard, which can only be obtained from dangerous surgical operations
A combination of noisy annotations obtained by humans and existing machine-based classification models was integrated

Summary

Introduction

There is a need for developing classification models based on noisy annotations. Various groups have studied the problem of developing classification models based on examples annotated by multiple labellers. Manually labelled data has been used successfully together with mathematical models to provide annotator-specific accuracy estimates based on multi-annotator agreement [19,20]. Valizadegan et al. [25] developed a probabilistic approach for learning classification models from opinions provided by multiple doctors and applied the approach to heparin-induced thrombocytopenia (HIT) electronic health records (EHR). In protein disorder prediction, meta predictors are typically developed using training datasets labelled as disorder/order. These datasets contain a very small number of proteins that have not already been used for development of the component predictors. Here a meta predictor is constructed in a completely unsupervised process without use of confirmed disorder/order annotations [32].
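To make the idea of filtering novices and aggregating the remaining annotators concrete, the following is a minimal sketch, not the probabilistic algorithm of the paper, whose details are not given in this summary. It swaps in a simple heuristic: estimate each annotator's reliability as agreement with a provisional majority vote, drop annotators below a threshold, and take an agreement-weighted vote among the rest. The function names, the `min_agreement` threshold and the toy labels are all assumptions made for illustration.

```python
from collections import Counter


def pick_label(scores):
    """Return the label with the highest score; ties go to the smallest label."""
    best = max(scores.values())
    return min(label for label, s in scores.items() if s == best)


def filter_and_aggregate(annotations, min_agreement=0.6):
    """Heuristic consensus labelling (illustrative, not the paper's method).

    annotations: dict mapping annotator name -> list of labels, one per item.
    Returns (kept annotators, agreement per annotator, consensus labels).
    """
    names = sorted(annotations)
    n_items = len(annotations[names[0]])

    # Step 1: provisional consensus from an unweighted majority vote.
    provisional = [
        pick_label(Counter(annotations[a][i] for a in names))
        for i in range(n_items)
    ]

    # Step 2: reliability proxy = fraction of items matching the provisional vote.
    agreement = {
        a: sum(x == y for x, y in zip(annotations[a], provisional)) / n_items
        for a in names
    }

    # Step 3: filter out annotators whose agreement falls below the threshold.
    kept = [a for a in names if agreement[a] >= min_agreement] or names

    # Step 4: agreement-weighted vote among the remaining annotators.
    consensus = []
    for i in range(n_items):
        scores = Counter()
        for a in kept:
            scores[annotations[a][i]] += agreement[a]
        consensus.append(pick_label(scores))
    return kept, agreement, consensus


# Hypothetical toy data: three fairly consistent annotators and one noisy one.
toy = {
    "expert_1": [1, 0, 1, 1, 0, 1],
    "expert_2": [1, 0, 1, 1, 0, 0],
    "expert_3": [1, 0, 0, 1, 0, 1],
    "novice": [0, 1, 0, 0, 1, 1],
}
kept, agreement, consensus = filter_and_aggregate(toy)
print(kept)       # ['expert_1', 'expert_2', 'expert_3']
print(consensus)  # [1, 0, 1, 1, 0, 1]
```

On the toy data the noisy annotator agrees with the provisional vote on only two of six items and is dropped. The third item, tied in the unweighted vote, is then resolved by the weighted vote among the three remaining annotators. A model-based approach such as the one the paper describes would estimate annotator quality and the ground truth jointly rather than in this two-pass fashion.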

Methods

Results

Conclusion

Journal: BMC Bioinformatics	Publication Date: Sep 1, 2013
Citations: 41	License type: CC BY 2.0
