Abstract

A query Q is boundedly evaluable under a set A of access constraints if for all datasets D that satisfy A, there exists a fraction DQ of D such that Q(D) = Q(DQ), and the size of DQ and time for identifying DQ are both independent of the size of D. That is, we can compute Q(D) by accessing a bounded amount of data no matter how big D grows. However, while desirable, it is undecidable to determine whether a query in relational algebra (RA) is bounded under A. In light of the undecidability, this paper develops an effective syntax for bounded RA queries. We identify a class of covered RA queries such that under A, (a) every boundedly evaluable RA query is equivalent to a covered query, (b) every covered RA query is boundedly evaluable, and (c) it takes PTIME in |Q| and |A| to check whether Q is covered by A. We provide quadratic-time algorithms to check the coverage of Q, and to generate a bounded query plan for covered Q. We also study a new optimization problem for minimizing access constraints for covered queries. Using real-life data, we experimentally verify that a large number of RA queries in practice are covered, and that bounded query plans improve RA query evaluation by orders of magnitude.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call