Abstract
The WAND processing strategy is a dynamic pruning algorithm designed for large‐scale Web search engines, where fast response to queries is critical. WAND reduces the amount of computation by scoring only documents that may become part of the top‐k results. In this paper, we present two parallel strategies for the WAND algorithm and compare their performance on GPUs. In our first strategy (named size‐based), the posting lists are evenly partitioned among thread blocks. Our second strategy (named range‐based) partitions the posting lists according to document identifier intervals; thus, partitions may have different sizes. We also propose three threshold sharing policies, named Local, Safe‐R, and Safe‐WR, which emulate the global pruning technique of the WAND algorithm. We evaluated our proposals with different amounts of work, from short to extra‐large queries, using both single query processing and batches of queries. Results show that the size‐based strategy reports the highest speedups but at the cost of low quality of results. The range‐based algorithm retrieves the exact top‐k documents and maintains a good speedup. Moreover, both strategies scale as the amount of work increases. In addition, there is no significant difference in the performance of the three threshold sharing policies.
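To make the two partitioning strategies concrete, the following is a minimal CPU-side sketch in Python, not the GPU implementation evaluated in the paper. It assumes each posting list is a sorted list of integer document identifiers; the function names `size_based_partition` and `range_based_partition`, the `num_blocks` and `max_doc_id` parameters, and the choice of equal-width identifier intervals are all illustrative assumptions rather than details taken from the paper.

```python
from bisect import bisect_left


def size_based_partition(postings, num_blocks):
    """Split a sorted posting list into num_blocks chunks of near-equal length.

    Hypothetical helper: each chunk stands in for the work of one thread block.
    """
    base, extra = divmod(len(postings), num_blocks)
    chunks, start = [], 0
    for b in range(num_blocks):
        # The first `extra` chunks receive one additional posting.
        end = start + base + (1 if b < extra else 0)
        chunks.append(postings[start:end])
        start = end
    return chunks


def range_based_partition(postings, num_blocks, max_doc_id):
    """Split a sorted posting list by docID intervals of equal width.

    Hypothetical helper: chunk b holds the docIDs in [b * width, (b + 1) * width),
    so chunk sizes depend on how the docIDs are distributed.
    """
    width = -(-(max_doc_id + 1) // num_blocks)  # ceiling division
    chunks = []
    for b in range(num_blocks):
        lo = bisect_left(postings, b * width)
        hi = bisect_left(postings, (b + 1) * width)
        chunks.append(postings[lo:hi])
    return chunks


if __name__ == "__main__":
    term_a = [1, 2, 3, 4, 5, 90]
    term_b = [5, 90, 91, 92]
    for name, plist in (("term_a", term_a), ("term_b", term_b)):
        print(name, "size-based: ", size_based_partition(plist, 2))
        print(name, "range-based:", range_based_partition(plist, 2, 99))
```

In this toy input, the size‐based split places document 5 in the second chunk of `term_a` but in the first chunk of `term_b`, so no single chunk pair sees both postings of that document. The range‐based split keeps both postings of document 5 in the first interval, at the price of unequal chunk sizes. This is only an illustration of the trade-off described above under the stated assumptions.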
More From: Concurrency and Computation: Practice and Experience