A Survey on Advancing the DBMS Query Optimizer: Cardinality Estimation, Cost Model, and Plan Enumeration

Hai Lan,Yuwei Peng,Zhifeng Bao

doi:10.1007/s41019-020-00149-7

Open Access

https://doi.org/10.1007/s41019-020-00149-7

Journal: Data Science and Engineering	Publication Date: Jan 15, 2021
Citations: 46	License type: open-access

Affiliation: RMIT University, Wuhan University

Abstract

The query optimizer is at the heart of database systems. The cost-based optimizer studied in this paper is adopted in almost all current database systems. A cost-based optimizer uses a plan enumeration algorithm to find a (sub)plan, uses a cost model to obtain the cost of that plan, and selects the plan with the lowest cost. In the cost model, cardinality, the number of tuples passing through an operator, plays a crucial role. Because of inaccurate cardinality estimation, errors in the cost model, and the huge plan space, the optimizer cannot find the optimal execution plan for a complex query in a reasonable time. In this paper, we first study in depth the causes behind these limitations. Next, we review the techniques used to improve the quality of the three key components of the cost-based optimizer: cardinality estimation, the cost model, and plan enumeration. We also provide our insights on future directions for each of these aspects.

Highlights

The query optimizer is at the heart of relational database management systems (RDBMSes) and some big data processing engines, e.g., SCOPE [7]
We focus on the query optimizer and give a comprehensive survey of its three key components
Cardinality estimation estimates the number of tuples generated by an operator; the cost model uses this estimate to calculate the cost of that operator

Summary

Introduction

The query optimizer is at the heart of relational database management systems (RDBMSes) and some big data processing engines, e.g., SCOPE [7]. Given a query written in a declarative language (e.g., SQL), the optimizer finds the most efficient execution plan (called the physical plan) and feeds it to the executor. Provided that the estimated cardinality and cost are accurate, and the plan enumeration component can efficiently walk through the huge search space, this architecture can obtain the optimal execution plan in a reasonable time.
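To make the interplay of the three components concrete, the following is a minimal Python sketch, not the method of any particular system. It assumes hypothetical table sizes and join selectivities (`TABLE_ROWS`, `JOIN_SELECTIVITY`), estimates cardinality under a predicate-independence assumption, uses the sum of estimated intermediate result sizes as a toy cost model, and enumerates only left-deep join orders by brute force. All names and numbers are illustrative assumptions.

```python
from itertools import permutations

# Hypothetical base-table row counts (illustrative only, not from the paper).
TABLE_ROWS = {"A": 1000, "B": 100, "C": 10}

# Hypothetical selectivities for table pairs that share a join predicate.
# Pairs without an entry are treated as cross products (selectivity 1).
JOIN_SELECTIVITY = {
    frozenset({"A", "B"}): 0.01,
    frozenset({"B", "C"}): 0.1,
}


def estimate_cardinality(tables):
    """Estimate the output rows of joining `tables`, assuming independent predicates."""
    rows = 1.0
    for t in tables:
        rows *= TABLE_ROWS[t]
    joined = set(tables)
    for pair, selectivity in JOIN_SELECTIVITY.items():
        if pair <= joined:  # both sides of the predicate are present
            rows *= selectivity
    return rows


def plan_cost(order):
    """Toy cost model: sum of the estimated sizes of all intermediate join results."""
    return sum(estimate_cardinality(order[:i]) for i in range(2, len(order) + 1))


def best_left_deep_plan(tables):
    """Plan enumeration by brute force over every left-deep join order."""
    return min(permutations(tables), key=plan_cost)


if __name__ == "__main__":
    best = best_left_deep_plan(["A", "B", "C"])
    print(best, plan_cost(best))
```

With these assumed numbers, the cheapest orders join `B` and `C` first, because that pair has the smallest estimated intermediate result. The brute-force enumeration examines n! orders for n tables, which illustrates why the plan space becomes a limiting factor for complex queries; a wrong selectivity in `JOIN_SELECTIVITY` would likewise steer the choice toward a worse plan.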

[Figure: taxonomy of the survey, "A Survey on Advancing the DBMS Query Optimizer". Top-level branches: Cardinality Estimation, Cost Model, Plan Enumeration. Node labels, in extraction order: Synopsis-Based Methods; Histogram; Sketch; Other Techniques; Sampling-Based Methods; Model predicates; Supervised Methods; Unsupervised Methods; Methods; Summaries; Possible Future Directions; Quality Improvement of Existing Cost Model; Cost Model Alternatives; Dynamic Programming; Top-Down Strategies; Large Queries; Others; Learning-Based Methods.]


Similar Papers

Analysis and Improvement of Optimizer for Query Processing on Graph Store
Youyang Yao ... Rong Chen
27 Aug 2018

COMPASS: Online Sketch-based Query Optimization for In-Memory Databases
Yesdaulet Izenov ... Florin Rusu
09 Jun 2021

A Model for Building Dynamic Indexes & Storage and Re-use of Optimal Query Plans Generated thru Progressive Optimization (POP)
Sreekumar Vobugari ... D V L N Somayajulu
International journal of machine learning and computing | VOL. -
01 Jan 2012

Flow-loss
Parimarjan Negi ... Nesime Tatbul
Proceedings of the VLDB Endowment | VOL. 14
01 Jul 2021
