Abstract
Crowdsourcing offers an efficient way to obtain a multiple noisy label set for each instance from different crowd workers, after which label integration algorithms are used to infer its integrated label. Despite the effectiveness of label integration algorithms, a certain degree of noise always remains in the integrated labels, and thus noise correction algorithms have been proposed to reduce the effect of this noise. However, existing noise correction algorithms seldom consider the effect of instance difficulty on noise correction. In this paper, we argue that the greater the difficulty of an instance, the fewer crowd workers can label it correctly, and the more likely the instance is a noise instance. Based on this premise, we propose a simple but very effective noise correction algorithm called instance difficulty-based noise correction (IDNC). In IDNC, we first propose two methods to measure the difficulty of each instance. Then, we use these two methods to filter the noise instances, obtaining a clean set and a noise set. Finally, we build two different classifiers on the clean set to correct the noise instances in the noise set via consensus voting. Extensive experiments on both simulated and real-world crowdsourced datasets validate the effectiveness and efficiency of our proposed IDNC.
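The three-step pipeline described above (measure difficulty, split into clean and noise sets, correct via consensus of two classifiers) can be sketched as follows. This is only an illustrative sketch: the disagreement-rate difficulty measure, the 0.5 threshold, and the two toy classifiers (a 1-nearest-neighbour model and a majority-class prior) are assumptions for demonstration, not the paper's actual two difficulty measures or classifiers.

```python
# Illustrative sketch of an IDNC-style pipeline. The difficulty measure
# (disagreement with the integrated label), the threshold, and both
# classifiers are assumptions, not the paper's actual methods.
from collections import Counter

# Each instance is a triple: (feature, list of crowd labels, integrated label).

def difficulty(crowd_labels, integrated):
    """Fraction of crowd labels that disagree with the integrated label."""
    return sum(1 for y in crowd_labels if y != integrated) / len(crowd_labels)

def split_by_difficulty(instances, threshold=0.5):
    """Filter instances into a clean set and a noise set by difficulty."""
    clean, noisy = [], []
    for x, labels, y_hat in instances:
        (noisy if difficulty(labels, y_hat) > threshold else clean).append(
            (x, labels, y_hat))
    return clean, noisy

def nn_classifier(clean):
    """1-nearest-neighbour classifier built on the clean set (1-D features)."""
    def predict(x):
        _, _, label = min((abs(x - cx), i, cy)
                          for i, (cx, _, cy) in enumerate(clean))
        return label
    return predict

def prior_classifier(clean):
    """Majority-class prior: a deliberately crude second classifier."""
    majority = Counter(y for _, _, y in clean).most_common(1)[0][0]
    return lambda x: majority

def idnc_correct(instances, threshold=0.5):
    """Relabel a noise instance only when both classifiers agree (consensus)."""
    clean, noisy = split_by_difficulty(instances, threshold)
    c1, c2 = nn_classifier(clean), prior_classifier(clean)
    corrected = []
    for x, labels, y_hat in noisy:
        p1, p2 = c1(x), c2(x)
        corrected.append((x, p1 if p1 == p2 else y_hat))
    return clean, corrected
```

Here consensus voting is conservative by design: when the two classifiers disagree, the instance keeps its original integrated label rather than being relabelled.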