Speeding Up End-to-end Query Execution via Learning-based Progressive Cardinality Estimation

Fang Wang, Zunyao Mao, Man Lung Yiu, Xiao Yan, Shuai Li, Bo Tang

doi:10.1145/3588708

Abstract

Fast query execution requires learning-based cardinality estimators to have short inference time (as model inference time adds to end-to-end query execution time) and high estimation accuracy (which is crucial for finding a good execution plan). However, existing estimators cannot meet both requirements because more accurate models tend to be more complex and therefore slower at inference. We propose a novel Learning-based Progressive Cardinality Estimator (LPCE), which adopts a query re-optimization methodology. In particular, LPCE consists of an initial model (LPCE-I), which estimates cardinalities before query execution, and a refinement model (LPCE-R), which progressively refines the cardinality estimates using the actual cardinalities of the executed operators. During query execution, re-optimization is triggered if the estimates of LPCE-I are found to have large errors, and more efficient execution plans are selected for the remaining operators using the refined estimates provided by LPCE-R. Both LPCE-I and LPCE-R are lightweight query-driven estimators, yet they achieve both good efficiency and high accuracy when used jointly. Besides designing the models for LPCE-I and LPCE-R, we also integrate re-optimization and LPCE into PostgreSQL, a popular database engine. Extensive experiments show that LPCE yields shorter end-to-end query execution time than state-of-the-art learning-based estimators.
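
The re-optimization loop described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the q-error metric, the threshold of 10, the function names `q_error` and `run_progressively`, and the `refine` callback (a stand-in for LPCE-R) are all assumptions made for illustration. The abstract states only that re-optimization is triggered when the initial estimates have large errors.

```python
from typing import Callable, List, Sequence


def q_error(estimated: float, actual: float) -> float:
    """Ratio-based error, always >= 1 (assumed metric; inputs clamped to >= 1 row)."""
    est = max(estimated, 1.0)
    act = max(actual, 1.0)
    return max(est / act, act / est)


def run_progressively(
    initial_estimates: Sequence[float],
    actual_cards: Sequence[float],
    refine: Callable[[List[float], List[float]], List[float]],
    threshold: float = 10.0,  # hypothetical trigger value, not taken from the paper
) -> List[int]:
    """Walk operators in execution order and return the indices where
    re-optimization is triggered.

    `refine(observed, stale)` is a hypothetical stand-in for the refinement
    model: given the actual cardinalities seen so far and the stale estimates
    of the remaining operators, it returns refined estimates for them.
    """
    estimates = list(initial_estimates)
    observed: List[float] = []
    reopt_points: List[int] = []
    for i, actual in enumerate(actual_cards):
        observed.append(actual)  # the operator has run, so its true cardinality is known
        has_remaining = i + 1 < len(estimates)
        if has_remaining and q_error(estimates[i], actual) > threshold:
            # Large error detected: refresh estimates for the operators not yet executed.
            estimates[i + 1:] = refine(observed, estimates[i + 1:])
            reopt_points.append(i)
    return reopt_points


if __name__ == "__main__":
    # Toy refinement that scales the stale estimates; purely illustrative.
    toy_refine = lambda observed, stale: [s * 100 for s in stale]
    print(run_progressively([100, 50, 20], [120, 5000, 2000], toy_refine))
```

In a real system the refreshed estimates would be handed back to the query optimizer to re-plan the remaining operators; the sketch only records where that would happen.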

Full Text