Abstract

In database systems, accurate cardinality estimation is a cornerstone of effective query optimization, and estimators that use machine learning have shown significant promise in this setting. Their effectiveness, however, depends strongly on the ability to learn from small training sets. This paper presents a novel approach to learned cardinality estimation that addresses this issue by improving sample efficiency. We propose a neural network architecture, informed by geometric deep learning principles, that represents queries as join graphs. Furthermore, we introduce a novel encoding for complex predicates, framing predicate encoding as a feature selection problem. Additionally, we devise a regularization term that exploits equivalences of relational algebra and three-valued logic, augmenting the training process without requiring additional ground-truth cardinalities. We rigorously evaluate our model across multiple benchmarks, examining q-errors, runtimes, and the impact of workload distribution shifts. Our results demonstrate that our model significantly improves the end-to-end runtimes of PostgreSQL, even with cardinalities gathered from as few as 100 query executions.
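To make the join-graph representation concrete, the sketch below is a deliberately simplified illustration of the general idea, not the paper's architecture: tables become graph nodes carrying encoded per-table predicate features, join predicates become edges, and a permutation-invariant message-passing step aggregates neighborhood information before a graph-level readout stands in for the regression head that would predict (log-)cardinality. All names and values here (JoinGraph, message_pass, the feature vectors) are hypothetical placeholders.

```python
from dataclasses import dataclass, field

@dataclass
class JoinGraph:
    # node id -> feature vector (e.g., an encoding of that table's predicates)
    nodes: dict[int, list[float]]
    # undirected join edges between table nodes
    edges: list[tuple[int, int]] = field(default_factory=list)

def message_pass(g: JoinGraph) -> dict[int, list[float]]:
    """One permutation-invariant aggregation step: each table node sums its
    join neighbors' features onto its own (a GNN layer without learned
    weights, for illustration only)."""
    out = {n: list(f) for n, f in g.nodes.items()}
    for u, v in g.edges:
        for i in range(len(out[u])):
            out[u][i] += g.nodes[v][i]
            out[v][i] += g.nodes[u][i]
    return out

def readout(h: dict[int, list[float]]) -> float:
    """Graph-level readout: mean over all node features, standing in for a
    learned regression head that would output an estimated cardinality."""
    vals = [x for feats in h.values() for x in feats]
    return sum(vals) / len(vals)

# Toy query: three tables joined in a chain, each with a 2-dimensional
# predicate encoding (the numbers are placeholders, not real statistics).
g = JoinGraph(
    nodes={0: [0.2, 1.0], 1: [0.5, 0.1], 2: [0.9, 0.3]},
    edges=[(0, 1), (1, 2)],
)
print(readout(message_pass(g)))
```

Because the aggregation is symmetric in the node ordering, the same query produces the same estimate regardless of how its tables are enumerated, which is the property that motivates graph-based encodings over flat, order-sensitive query vectors.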
