Abstract

We consider the problem of efficiently estimating the size of the join of a collection of preprocessed relational tables from the perspective of instance optimality analysis. The running time of an instance-optimal algorithm is comparable to the minimum time needed to verify the correctness of a solution. Previously, instance-optimal algorithms were known only when the join size was small, because one component of their running time was linear in the join size. We give an instance-optimal algorithm that estimates the join size for all instances, including those where the join is large, by removing the dependency on the join size. As a byproduct, we show how to sample rows from the join uniformly at random in a comparable amount of time.

