Abstract

Improving the robustness and efficiency of multi-way equijoin queries is challenging in both centralized and distributed database systems. Large amounts of unnecessary data carried through query processing seriously degrade both metrics. If this unnecessary data can be thoroughly pruned in advance, robustness and efficiency improve substantially. However, the pruning power of current strategies, such as predicate push-down and algebraic equivalence, is limited. We present deepDP, a powerful, generalized, and efficient data pruning strategy. deepDP builds multiple independent pruning spaces by generating the longest transitive closures and applies an appropriate data pruning strategy to each pruning space. To prune unnecessary data thoroughly, deepDP employs an $\alpha \cdot \beta$ pruning strategy that cleans each pruning space based on a newly designed statistic, the Hollow Range, and re-shuffles the elements in all pruned spaces to maximize the robustness and efficiency benefits while minimizing the invasion. We implement deepDP in PostgreSQL, although the approach is not limited to it, and evaluate it on TPC-H, JOB, and our synthetic benchmark, DHR. The experimental results show that, compared with traditional data pruning strategies, deepDP improves the efficiency of multi-way equijoin queries by 3.5x.
