Abstract
Database queries often contain multiple predicates, and optimizing conjunctive predicates remains vital to the overall performance of analytic data processing tasks. Prior work proposes several memory-efficient storage layouts, e.g., BitWeaving and ByteSlice, that significantly accelerate predicate evaluation by exploiting the circuit-level intra-cycle parallelism available in modern CPUs, which dramatically reduces the total number of instructions. However, the performance potential of conjunctive predicates under such storage layouts has not yet been harvested, because there is no accurate cost model to provide the insights needed to guide the optimization process. In this paper, we propose a hybrid empirical/analytical cost model (Understanding) to unveil the performance characteristics of such storage layouts when applied to predicate evaluation. Our cost model takes into account the effects of non-linear factors, e.g., cache misses and branch mispredictions, and applies easily to different CPUs. The main finding from our cost model is the distinction between high-cost instructions (which suffer from cache misses and/or branch mispredictions) and low-cost instructions (which enjoy cache hits and correct branch predictions) in the context of predicate evaluation under these storage layouts. Guided by this finding, we propose a simple execution scheme, Hebe (Optimizing), which is order-oblivious while maintaining high performance. Hebe is attractive to the query optimizer (QO), as the QO does not need to go through a sampling process to decide the optimal evaluation order in advance. The intuition behind Hebe is to significantly reduce the number of high-cost instructions while keeping the number of low-cost instructions unchanged. Our findings from Hebe shed light on the importance of an accurate cost model in guiding the derivation of efficient execution schemes for query processing on modern CPUs.
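To illustrate why evaluation order matters for a conventional, order-sensitive scheme, the following minimal sketch evaluates a two-predicate conjunction column by column and counts how many predicate evaluations each order performs. It is not Hebe and not the paper's cost model; the function `count_evaluations`, the synthetic columns, and the chosen selectivities are all assumptions made for illustration only.

```python
import random


def count_evaluations(columns, predicates, order):
    """Evaluate a conjunction one column at a time.

    Returns the matching row ids and the number of predicate
    evaluations performed (a crude, hypothetical stand-in for cost).
    """
    candidates = list(range(len(columns[0])))
    evaluations = 0
    for idx in order:
        col, pred = columns[idx], predicates[idx]
        # Only rows that survived the earlier predicates are checked.
        evaluations += len(candidates)
        candidates = [r for r in candidates if pred(col[r])]
    return candidates, evaluations


random.seed(0)
n = 10_000
# Two synthetic columns with values drawn uniformly from 0..99 (assumed data).
columns = [
    [random.randrange(100) for _ in range(n)],
    [random.randrange(100) for _ in range(n)],
]
predicates = [
    lambda v: v < 90,  # weakly selective: roughly 90% of rows pass
    lambda v: v < 5,   # highly selective: roughly 5% of rows pass
]

rows_a, cost_a = count_evaluations(columns, predicates, [0, 1])
rows_b, cost_b = count_evaluations(columns, predicates, [1, 0])
print(len(rows_a), cost_a, cost_b)
```

Both orders return the same rows, but running the more selective predicate first performs fewer evaluations. An order-sensitive scheme therefore depends on the optimizer estimating selectivities in advance, which is the dependency an order-oblivious scheme is meant to remove.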
More From: IEEE Transactions on Knowledge and Data Engineering