Abstract
The hash join algorithm family is one of the leading techniques studied in equi-join performance evaluation. OLAP systems borrow this line of research to efficiently implement foreign key joins between dimension tables and large fact tables. From the perspective of data warehouse schema and workload features, the hash join algorithm can be further simplified with multidimensional mapping, and foreign key join algorithms can be evaluated from multiple perspectives rather than from performance alone. In this paper, we introduce the surrogate key index oriented foreign key join, a schema-conscious foreign key join design customized for OLAP workloads, to comprehensively evaluate how state-of-the-art join algorithms perform in OLAP workloads. Our experiments and analysis yield the following insights: (1) a foreign key join algorithm customized for OLAP workloads can push join performance beyond that of general-purpose hash joins; (2) under a fine-grained micro join benchmark, each join algorithm shows strong and weak performance regions dominated by the cache locality ratio input_size/cache_size; (3) the simple hardware-oblivious shared hash table join outperforms the complex hardware-conscious radix partitioning hash join in most benchmark cases; (4) the customized foreign key join algorithm with a surrogate key index reduces algorithmic complexity for hardware accelerators and makes the join easy to implement on different hardware accelerators. Overall, we argue that improving join performance is a systematic effort rather than merely a matter of hardware-conscious algorithm optimization, and that OLAP domain knowledge makes the surrogate key index effective for foreign key joins in data warehousing workloads on both CPUs and hardware accelerators.
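To illustrate the contrast the abstract draws, the following minimal sketch compares a general-purpose hash table probe with a lookup through a dense array indexed by surrogate key. It is an assumption-laden illustration, not the paper's implementation: it assumes dimension keys are small non-negative integers, that every fact foreign key falls inside the index range, and that filtered-out dimension rows are marked with `None`. All function names (`hash_join`, `build_surrogate_index`, `surrogate_index_join`) are hypothetical.

```python
from typing import List, Optional, Tuple

DimRow = Tuple[int, str]          # (surrogate key, payload), assumed layout
JoinRow = Tuple[int, str]         # (fact row position, dimension payload)


def hash_join(dim_rows: List[DimRow], fact_fks: List[int]) -> List[JoinRow]:
    """General-purpose baseline: build a hash table, then probe it."""
    table = {key: payload for key, payload in dim_rows}
    return [(pos, table[fk]) for pos, fk in enumerate(fact_fks) if fk in table]


def build_surrogate_index(dim_rows: List[DimRow], max_key: int) -> List[Optional[str]]:
    """Place each payload at the array slot given by its surrogate key.

    Slots for keys that were filtered out (or never existed) stay None.
    """
    index: List[Optional[str]] = [None] * (max_key + 1)
    for key, payload in dim_rows:
        index[key] = payload
    return index


def surrogate_index_join(index: List[Optional[str]], fact_fks: List[int]) -> List[JoinRow]:
    """Join by direct array addressing: the foreign key is the array offset.

    Assumes referential integrity, i.e. 0 <= fk < len(index) for every fk.
    """
    result: List[JoinRow] = []
    for pos, fk in enumerate(fact_fks):
        payload = index[fk]
        if payload is not None:
            result.append((pos, payload))
    return result


# Dimension rows that survived a filter; key 2 was filtered out.
dim = [(1, "EU"), (3, "ASIA")]
fact_foreign_keys = [1, 2, 3, 3, 1]

baseline = hash_join(dim, fact_foreign_keys)
indexed = surrogate_index_join(build_surrogate_index(dim, max_key=3), fact_foreign_keys)
```

Both functions return the same join result here. The array variant replaces hashing and collision handling with a single offset computation, which is one plausible reading of why such a design would be simpler to port to hardware accelerators.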