Abstract

Learning-from-crowds aims to design proper aggregation strategies to infer the unknown true labels from the noisy labels provided by ordinary web workers. This paper presents max-margin majority voting (M$^3$V) to improve the discriminative ability of majority voting, and further presents a Bayesian generalization that incorporates the flexibility of generative methods in modeling noisy observations with worker confusion matrices for different application settings. We first introduce the crowdsourcing margin of majority voting, and then formulate the joint learning as a regularized Bayesian inference (RegBayes) problem, where the posterior regularization is derived by maximizing the margin between the aggregated score of a potential true label and that of any alternative label. Our Bayesian model naturally covers the Dawid-Skene estimator and M$^3$V as its two special cases. Owing to the flexibility of our model, we extend it to handle crowdsourced labels with an ordinal structure while keeping the main ideas of the crowdsourcing margin unchanged. Moreover, we consider an online learning-from-crowds setting where labels arrive in a stream. Empirical results demonstrate that our methods are competitive, often achieving better results than state-of-the-art estimators.
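
To make the margin idea concrete, the following is a minimal Python sketch of one possible reading of it: each worker's vote adds that worker's weight to the score of the label they chose, and the margin of a candidate label is its aggregated score minus the best score among all alternative labels. The function names, the dictionary-based data layout, and the example weights are assumptions made for illustration; they are not taken from the paper, and the learning of the weights (the max-margin and RegBayes parts) is not shown.

```python
from collections import defaultdict


def aggregated_scores(labels, weights):
    """Sum worker weights per label for a single item.

    labels:  dict mapping worker id -> label given by that worker
    weights: dict mapping worker id -> weight (hypothetical; 1.0 for every
             worker gives plain majority voting)
    """
    scores = defaultdict(float)
    for worker, label in labels.items():
        scores[label] += weights.get(worker, 1.0)
    return dict(scores)


def crowdsourcing_margin(labels, weights, candidate, all_labels):
    """Score of `candidate` minus the best score among the other labels."""
    scores = aggregated_scores(labels, weights)
    best_other = max(scores.get(d, 0.0) for d in all_labels if d != candidate)
    return scores.get(candidate, 0.0) - best_other


# Toy item labeled by three workers (illustrative data, not from the paper).
labels = {"w1": "cat", "w2": "cat", "w3": "dog"}
all_labels = ["cat", "dog", "bird"]

# Uniform weights: plain majority voting favors "cat".
uniform = {"w1": 1.0, "w2": 1.0, "w3": 1.0}
margin_uniform = crowdsourcing_margin(labels, uniform, "cat", all_labels)

# Trusting worker w3 more flips the sign of the margin for "cat".
reweighted = {"w1": 1.0, "w2": 1.0, "w3": 3.0}
margin_reweighted = crowdsourcing_margin(labels, reweighted, "cat", all_labels)
```

Under uniform weights the margin for "cat" is positive, so majority voting selects it; once w3 is weighted more heavily the margin becomes negative, which shows how worker weights change both the aggregated label and how confidently it is separated from the alternatives.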
