Abstract

Recently crowdsourcing services are often used to collect a large amount of labeled data for machine learning, since they provide us an easy way to get labels at very low cost and in a short period. The use of crowdsourcing has introduced a new challenge in machine learning, that is, coping with the variable quality of crowd-generated data. Although there have been many recent attempts to address the quality problem of multiple workers, only a few of the existing methods consider the problem of learning classifiers directly from such noisy data. All these methods modeled the true labels as latent variables, which resulted in non-convex optimization problems. In this paper, we propose a convex optimization formulation for learning from crowds without estimating the true labels by introducing personal models of the individual crowd workers. We also devise an efficient iterative method for solving the convex optimization problems by exploiting conditional independence structures in multiple classifiers. We evaluate the proposed method against three competing methods on synthetic data sets and a real crowdsourced data set and demonstrate that the proposed method outperforms the other three methods.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call