Abstract

In crowdsourcing scenarios, each instance obtains multiple noisy labels from different crowd workers and then receives its integrated label via a label aggregation method. Despite the effectiveness of label aggregation methods, a certain level of noise still remains in the integrated labels. To address this problem, several noise correction methods have been proposed in recent years. However, to the best of our knowledge, these methods seldom take the label confidence of each instance into account. We therefore propose a label confidence-based noise correction (LCNC) method. First, LCNC calculates the label confidence of each instance from its multiple noisy labels, filters all instances accordingly, and obtains an original clean set and noise set. Second, LCNC builds multiple random trees on the original clean set, recalculates the label confidence of each instance from the labels predicted by these random trees, and then re-filters all instances to obtain a new clean set and noise set. Finally, LCNC builds two heterogeneous classifiers on the new clean set and corrects the noisy instances in the new noise set according to a consensus voting strategy. Experimental results on 34 simulated and two real-world crowdsourced datasets show that LCNC significantly outperforms all state-of-the-art noise correction methods used for comparison.
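The two key mechanisms in the abstract can be illustrated with a minimal sketch. Here, label confidence is assumed to be the fraction of workers agreeing with an instance's majority label, and consensus correction is assumed to relabel a noisy instance only when two heterogeneous classifiers agree on its class; both the threshold value and the exact confidence definition are illustrative assumptions, not the paper's precise formulation.

```python
from collections import Counter

def label_confidence(noisy_labels):
    """Return (majority label, agreement fraction) for one instance.

    The agreement fraction serves as an assumed simple proxy for the
    label confidence computed by LCNC from multiple noisy labels.
    """
    counts = Counter(noisy_labels)
    majority_label, majority_count = counts.most_common(1)[0]
    return majority_label, majority_count / len(noisy_labels)

def split_by_confidence(instances, threshold=0.8):
    """Split instances (each a list of noisy labels) into a clean set
    and a noise set based on label confidence (threshold is illustrative)."""
    clean, noise = [], []
    for idx, labels in enumerate(instances):
        label, conf = label_confidence(labels)
        (clean if conf >= threshold else noise).append((idx, label, conf))
    return clean, noise

def consensus_correct(noise_items, clf_a, clf_b):
    """Correct a noisy instance only when two heterogeneous classifiers
    agree; otherwise keep its current integrated label."""
    corrected = []
    for idx, features, old_label in noise_items:
        pred_a, pred_b = clf_a(features), clf_b(features)
        corrected.append((idx, pred_a if pred_a == pred_b else old_label))
    return corrected

# Toy example: five workers label each of two instances.
instances = [
    ["a", "a", "a", "a", "b"],  # high agreement -> clean set
    ["a", "b", "a", "b", "b"],  # low agreement  -> noise set
]
clean, noise = split_by_confidence(instances)
```

In the full method, the clean set would then train the random trees used for re-filtering, and the final clean set would train the two heterogeneous classifiers passed to `consensus_correct`.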
