Abstract

Crowdsourcing is widely used to collect labeled examples for training supervised machine learning models, but the labels obtained from crowd workers are considerably noisier than those from expert annotators. To address this noise, most researchers adopt a repeated-labeling strategy, in which multiple (redundant) labels are collected for each example and then aggregated. Although this improves annotation quality, it reduces the number of distinct training examples when the crowdsourcing budget is fixed, which can hurt the accuracy of the resulting model. This paper empirically examines the extent to which repeated labeling contributes to the accuracy of machine learning models for image classification, named entity recognition, and sentiment analysis under various conditions of budget and worker quality. We experimentally tested four hypotheses concerning the effects of budget, worker quality, task difficulty, and label redundancy. The results on image classification and named entity recognition supported all four hypotheses and suggest that, in terms of model accuracy, repeated labeling almost always has a negative impact. Somewhat surprisingly, the results on sentiment analysis with pretrained models did not support the hypotheses, suggesting that repeated labeling may still be useful in some settings.
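
The following is a minimal sketch, not taken from the paper, of the budget trade-off the abstract describes: with a fixed labeling budget, a redundancy of r leaves budget // r distinct examples, each aggregated here by majority vote. The worker-accuracy parameter p_correct and the aggregation rule are illustrative assumptions, since the paper's exact setup is not specified in the abstract.

```python
from collections import Counter
import random

def simulate_labels(true_label, n_workers, p_correct, n_classes=2):
    """Simulate noisy worker labels: each worker is correct with probability p_correct."""
    labels = []
    for _ in range(n_workers):
        if random.random() < p_correct:
            labels.append(true_label)
        else:
            labels.append(random.choice([c for c in range(n_classes) if c != true_label]))
    return labels

def majority_vote(labels):
    """Aggregate redundant labels by plurality (ties broken arbitrarily)."""
    return Counter(labels).most_common(1)[0][0]

def build_training_set(true_labels, budget, redundancy, p_correct):
    """Spend `budget` label acquisitions: budget // redundancy examples,
    each labeled `redundancy` times and aggregated into one training label."""
    n_examples = budget // redundancy
    dataset = []
    for true_label in true_labels[:n_examples]:
        worker_labels = simulate_labels(true_label, redundancy, p_correct)
        dataset.append(majority_vote(worker_labels))
    return dataset

# Example: the same budget yields 1000 singly-labeled examples (redundancy=1)
# or 200 examples with 5 aggregated labels each (redundancy=5).
random.seed(0)
truth = [random.randint(0, 1) for _ in range(1000)]
single = build_training_set(truth, budget=1000, redundancy=1, p_correct=0.7)
redundant = build_training_set(truth, budget=1000, redundancy=5, p_correct=0.7)
print(len(single), len(redundant))  # 1000 200
```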
