Abstract

In database systems, accurate cardinality estimation is a cornerstone of effective query optimization, and estimators that use machine learning have shown significant promise in this setting. Their effectiveness, however, depends strongly on the ability to learn from small training sets. This paper presents a novel approach to learned cardinality estimation that addresses this issue by improving sample efficiency. We propose a neural network architecture, informed by geometric deep learning principles, that represents queries as join graphs. Furthermore, we introduce a novel encoding for complex predicates, framing predicate encoding as a feature selection problem. Additionally, we devise a regularization term that exploits equivalences of relational algebra and three-valued logic, augmenting the training process without requiring additional ground-truth cardinalities. We rigorously evaluate our model across multiple benchmarks, examining q-errors, runtimes, and the impact of workload distribution shifts. Our results demonstrate that our model significantly improves the end-to-end runtimes of PostgreSQL, even with cardinalities gathered from as few as 100 query executions.
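To make the join-graph representation concrete, the sketch below is a deliberately simplified illustration of the general idea, not the paper's architecture: tables become graph nodes carrying encoded per-table predicate features, join predicates become edges, and a permutation-invariant message-passing step aggregates neighborhood information before a graph-level readout stands in for the regression head that would predict (log-)cardinality. All names and values here (JoinGraph, message_pass, the feature vectors) are hypothetical placeholders.

```python
from dataclasses import dataclass, field

@dataclass
class JoinGraph:
    # node id -> feature vector (e.g., an encoding of that table's predicates)
    nodes: dict[int, list[float]]
    # undirected join edges between table nodes
    edges: list[tuple[int, int]] = field(default_factory=list)

def message_pass(g: JoinGraph) -> dict[int, list[float]]:
    """One permutation-invariant aggregation step: each table node sums its
    join neighbors' features onto its own (a GNN layer without learned
    weights, for illustration only)."""
    out = {n: list(f) for n, f in g.nodes.items()}
    for u, v in g.edges:
        for i in range(len(out[u])):
            out[u][i] += g.nodes[v][i]
            out[v][i] += g.nodes[u][i]
    return out

def readout(h: dict[int, list[float]]) -> float:
    """Graph-level readout: mean over all node features, standing in for a
    learned regression head that would output an estimated cardinality."""
    vals = [x for feats in h.values() for x in feats]
    return sum(vals) / len(vals)

# Toy query: three tables joined in a chain, each with a 2-dimensional
# predicate encoding (the numbers are placeholders, not real statistics).
g = JoinGraph(
    nodes={0: [0.2, 1.0], 1: [0.5, 0.1], 2: [0.9, 0.3]},
    edges=[(0, 1), (1, 2)],
)
print(readout(message_pass(g)))
```

Because the aggregation is symmetric in the node ordering, the same query produces the same estimate regardless of how its tables are enumerated, which is the property that motivates graph-based encodings over flat, order-sensitive query vectors.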
