Abstract

Set intersection is one of the most important operations for many applications such as Web search engines or database management systems. This paper describes our new algorithm to efficiently find set intersections with sorted arrays on modern processors with SIMD instructions and high branch misprediction penalties. Our algorithm efficiently exploits SIMD instructions and can drastically reduce branch mispredictions. Our algorithm extends a merge-based algorithm by reading multiple elements, instead of just one element, from each of two input arrays and compares all of the pairs of elements from the two arrays to find the elements with the same values. The key insight for our improvement is that we can reduce the number of costly hard-to-predict conditional branches by advancing a pointer by more than one element at a time. Although this algorithm increases the total number of comparisons, we can execute these comparisons more efficiently using the SIMD instructions and gain the benefits of the reduced branch misprediction overhead. Our algorithm is suitable to replace existing standard library functions, such as std::set_intersection in C++, thus accelerating many applications, because the algorithm is simple and requires no preprocessing to generate additional data structures. We implemented our algorithm on Xeon and POWER7+. The experimental results show our algorithm outperforms the std::set_intersection implementation delivered with gcc by up to 5.2x using SIMD instructions and by up to 2.1x even without using SIMD instructions for 32-bit and 64-bit integer datasets. Our SIMD algorithm also outperformed an existing algorithm that can leverage SIMD instructions.

Highlights

  • Set intersection, which selects common elements from two input sets, is one of the most important operations in many applications, including multi-word queries in Web search engines and join operations in database management systems

  • We studied the branch misprediction rate, the Cycles Per instruction (CPI), and the path length

  • This paper described our new highly efficient algorithm for set intersections on sorted arrays on modern processors

Read more

Summary

Introduction

Set intersection, which selects common elements from two input sets, is one of the most important operations in many applications, including multi-word queries in Web search engines and join operations in database management systems. In Web search engines the set intersection is heavily used for multi-word queries to find documents containing two or more keywords by intersecting the sorted lists of matching document IDs from the individual query words [1]. In such systems, the performance of the sorted set intersection often dominates the overall performance. Unlike most of the existing advanced techniques, we focus on improving the execution efficiency of the set intersection on the microarchitectures of today’s processors by reducing the branch mispredictions instead of reducing the number of comparisons. If the predicted direction of the branch does not match the actual outcome of the branch (branch misprediction), the hardware typically wastes more than ten CPU cycles because it needs to flush speculatively executed instructions and restart the execution from the fetch of the instruction for the actual branch direction

Objectives
Results
Conclusion
Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call