Fast algorithms for universal quantification in large databases

Goetz Graefe,Richard L Cole

doi:10.1145/210197.210202

Abstract

Universal quantification is not supported directly in most database systems despite the fact that it adds significant power to a system's query processing and inference capabilities, in particular for the analysis of many-to-many relationships and of set-valued attributes. One of the main reasons for this omission has been that universal quantification algorithms and their performance have not been explored for large databases. In this article, we describe and compare three known algorithms and one recently proposed algorithm for relational division, the algebra operator that embodies universal quantification. For each algorithm, we investigate the performance effects of explicit duplicate removal and referential integrity enforcement, variants for inputs larger than memory, and parallel execution strategies. Analytical and experimental performance comparisons illustrate the substantial differences among the algorithms. Moreover, comparisons demonstrate that the recently proposed division algorithm evaluates a universal quantification predicate over two relations as fast as hash (semi-) join evaluates an existential quantification predicate over the same relations. Thus, existential and universal quantification can be supported with equal efficiency by adding the recently proposed algorithm to a query evaluation system. A second result of our study is that universal quantification should be expressed directly in a database query language, because most query optimizers do not recognize the rather indirect formulations available in SQL as relational division and therefore produce very poor evaluation plans for many universal quantification queries.

Full Text