CORRELATIONS AND QUERY PROCESSING

Bhanu Shanker Prasad

doi:10.21474/ijar01/11726

Abstract

It is known that optimizing join queries based on average selectivities is suboptimal in highly correlated databases. In such databases, relations are naturally divided into partitions, each with substantially different statistical characteristics. It is therefore compelling to discover these data partitions during query optimization and to create multiple plans for a given query, each plan being optimal for a particular combination of data partitions. This scenario calls for sharing state among plans, so that common intermediate results are not recomputed. We study this problem in a setting with a routing-based query execution engine based on eddies. Eddies naturally encapsulate horizontal partitioning and maximal state sharing across multiple plans. The purpose of this paper is to show faster execution than traditional optimization under high correlations, while maintaining the same performance under low correlations.

Summary

Introduction

It is known that optimizing join queries based on average selectivities is suboptimal in highly correlated databases. In such databases, relations are naturally divided into partitions, each with substantially different statistical characteristics. Traditional query optimizers pick one execution plan per query, based on first-order statistics about the underlying data. The presence of data correlations makes selectivity estimation harder, but it also offers opportunities for more effective query optimization. When data correlations are present, the input relations are naturally divided into partitions, each partition having completely different statistical characteristics. It is then attractive to create multiple plans per query, each optimized for a different combination of data partitions. The combined cost of the resulting plans can be smaller than the cost of any possible monolithic plan.
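The claim that per-partition plans can beat every monolithic plan can be illustrated with a toy cost model. The sketch below is not taken from the paper: the operators A and B, the partitions P1 and P2, their sizes and selectivities, and the "one unit of cost per probe" model are all assumptions chosen for illustration. Each partition has the same average selectivities (0.5 for both operators), so an optimizer using averages sees no difference between the two operator orders, yet the partitions individually favor opposite orders.

```python
from itertools import permutations

def pipeline_cost(n_tuples, order, selectivity):
    """Expected number of probes when n_tuples flow through operators in `order`.

    Assumed cost model: each surviving tuple pays one unit per operator it probes.
    """
    cost, surviving = 0.0, float(n_tuples)
    for op in order:
        cost += surviving                 # every surviving tuple probes this operator
        surviving *= selectivity[op]      # only a fraction passes on to the next one
    return cost

def best_order(n_tuples, selectivity):
    """Cheapest operator order for one partition under the assumed cost model."""
    return min(permutations(selectivity),
               key=lambda order: pipeline_cost(n_tuples, order, selectivity))

def monolithic_cost(partitions):
    """Cost of the best single order applied to all partitions alike."""
    ops = list(next(iter(partitions.values()))[1])
    return min(
        sum(pipeline_cost(n, order, sel) for n, sel in partitions.values())
        for order in permutations(ops)
    )

def partitioned_cost(partitions):
    """Combined cost when each partition gets its own best order."""
    return sum(pipeline_cost(n, best_order(n, sel), sel)
               for n, sel in partitions.values())

# Hypothetical correlated data: (partition size, per-operator selectivities).
partitions = {
    "P1": (1000, {"A": 0.1, "B": 0.9}),
    "P2": (1000, {"A": 0.9, "B": 0.1}),
}

if __name__ == "__main__":
    print("best monolithic plan:", monolithic_cost(partitions))
    print("per-partition plans: ", partitioned_cost(partitions))
```

Under these assumed numbers, either monolithic order performs about 3000 probes, while routing P1 through A first and P2 through B first performs about 2200. The sketch ignores the cost of discovering the partitions and of sharing state among plans, which the eddies-based setting described above is meant to address.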

Journal: International Journal of Advanced Research
Publication Date: Sep 30, 2020
License type: cc-by
