DataSift: An Expressive and Accurate Crowd-Powered Search Toolkit

Aditya Parameswaran, Hector Garcia-Molina, Ming Han Teh, Jennifer Widom

doi:10.1609/hcomp.v1i1.13077

Open Access

https://doi.org/10.1609/hcomp.v1i1.13077

Journal: Proceedings of the AAAI Conference on Human Computation and Crowdsourcing
Publication Date: Nov 3, 2013
Citations: 8

Affiliation: Stanford University

Abstract

Traditional information retrieval systems have limited functionality. For instance, they are not able to adequately support queries containing non-textual fragments such as images or videos, queries that are very long or ambiguous, or semantically-rich queries over non-textual corpora. In this paper, we present DataSift, an expressive and accurate crowd-powered search toolkit that can connect to any corpus. We provide a number of alternative configurations for DataSift using crowdsourced and automated components, and demonstrate gains of 2–3x on precision over traditional retrieval schemes using experiments on real corpora. We also present our results on determining suitable values for parameters in those configurations, along with a number of interesting insights learned along the way.
