Abstract
The popularity of crowdsourcing has recently brought about new opportunities for engaging human intelligence in the process of data analysis. Most existing works on crowdsourcing have developed sophisticated methods to utilize the crowd as a new kind of processor, a.k.a. Human Processor Units (HPUs). In this paper, we propose a framework, called MaC, to combine the powers of both CPUs and HPUs. To build MaC, we need to tackle the following two challenges: (1) HIT Selection: Selecting the "right" HITs (Human Intelligence Tasks) can significantly reduce uncertainty and help the results converge quickly. We therefore propose an entropy-based model to evaluate the informativeness of HITs. Because selecting HITs has factorial complexity and the optimization function is non-linear, we also propose an efficient approximation algorithm with a bounded error. (2) Uncertainty Management: Crowdsourced answers can be inaccurate. To address this issue, we provide effective solutions in three common scenarios of crowdsourcing: (a) the answer and the confidence of each worker are available; (b) the confidence of each worker and the voting score for each HIT are available; (c) only the answer of each worker is available. To verify the effectiveness of the MaC framework, we built a hybrid Machine-Crowd system and tested it on three real-world applications: data fusion, information extraction, and pattern recognition. The experimental results verified the effectiveness and the applicability of our framework.
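To make the entropy-based idea concrete, the following is a minimal sketch and not the MaC algorithm itself. It assumes each candidate HIT is an independent yes/no question with a machine-estimated prior probability, scores a HIT by its Shannon entropy, picks the k most uncertain HITs greedily, and updates a prior from one worker's answer and confidence (scenario (a)). The function names, the independence assumption, and the reading of confidence as the probability that a worker is correct (strictly between 0 and 1) are all illustrative assumptions; the abstract does not specify these details.

```python
import math


def binary_entropy(p: float) -> float:
    """Shannon entropy in bits of a yes/no question answered 'yes' with probability p."""
    if p <= 0.0 or p >= 1.0:
        return 0.0  # a certain answer carries no uncertainty
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))


def select_hits(priors: dict[str, float], k: int) -> list[str]:
    """Hypothetical greedy stand-in: return the k HIT ids with the highest entropy.

    Assumes HITs are independent, which sidesteps the joint (factorial)
    selection problem described in the abstract.
    """
    ranked = sorted(priors, key=lambda h: binary_entropy(priors[h]), reverse=True)
    return ranked[:k]


def update_with_answer(prior: float, says_yes: bool, confidence: float) -> float:
    """Bayes update of P(yes) after one worker answers.

    Assumes `confidence` is the probability that the worker is correct,
    with 0 < confidence < 1.
    """
    like_yes = confidence if says_yes else 1 - confidence
    like_no = 1 - confidence if says_yes else confidence
    numerator = like_yes * prior
    return numerator / (numerator + like_no * (1 - prior))


if __name__ == "__main__":
    priors = {"a": 0.5, "b": 0.9, "c": 0.6}  # made-up machine estimates
    chosen = select_hits(priors, k=2)
    print(chosen)
    print(update_with_answer(priors["a"], says_yes=True, confidence=0.8))
```

Under these assumptions, a HIT whose prior is near 0.5 is the most informative to ask, and a confident worker's answer moves the posterior further from the prior than an unsure worker's answer does.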