Conjunctive queries are queries over a relational database and are at the core of relational query languages such as SQL. Testing for containment (and equivalence) of such queries arises as part of many advanced features of query optimization, for example, using materialized views, processing correlated nested queries, semantic query optimization, and global query optimization. Earlier formal work on the topic has examined conjunctive queries over sets of tuples, where each query can be viewed as a function from sets to sets. Containment (and equivalence) of conjunctive queries has been naturally defined based on set inclusion and has been shown to be an NP-complete problem. Even in SQL, however, queries over multisets of tuples may be posed. In fact, relations are treated as multisets by default, with duplicates being eliminated only after explicit requests. Thus, in order to reason about containment/equivalence of a large class of SQL queries, it is necessary to consider a generalization of conjunctive queries, in which relations are interpreted as multisets of tuples: the view of a relation as a set of tuples must be generalized. In this paper we study conjunctive queries over databases in which each tuple has an associated label. This generalized notion of a database allows us to consider relations that are multisets and relations that are fuzzy sets. As a special case, we can also model traditional set-relations by making the label associated with a tuple be either “true” (meaning that the tuple is in the relation) or “false” (meaning that the tuple is not in the relation). In order to keep our results general, we consider a variety of label systems, where each label system is essentially a set of conditions on the labels that can be associated with tuples. Once a result is established for a label system, it holds for all interpretations of relations that satisfy these conditions.
For example, we present a necessary and sufficient condition for containment of conjunctive queries for label systems of a type that abstracts both the traditional set-relations and fuzzy sets. We also present a different necessary and sufficient condition for containment of a restricted class of conjunctive queries for a label system that abstracts relations as multisets. Finally, we show that containment of unions of conjunctive queries is decidable for label systems of the first type and undecidable for label systems of the second type. This result underscores the fundamental difference between viewing relations as sets and as multisets, and motivates a closer examination of relations as multisets, given their importance in SQL.
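The difference between the set and multiset views can be illustrated with a small Python sketch. It is not the paper's formalism: the representation of a label system as a triple (an operation for combining labels across joined atoms, an operation for merging labels of tuples that project to the same answer, and a default label), as well as the names `SET`, `BAG`, `FUZZY`, and `evaluate`, are assumptions made only for this illustration. The two example queries, Q1(x) :- R(x, y) and Q2(x) :- R(x, y), R(x, z), return the same answers under the set and fuzzy interpretations but different multiplicities under the multiset interpretation.

```python
from itertools import product
from operator import add, mul

# Hypothetical encoding of a label system: (times, plus, zero).
# "times" combines labels of joined atoms, "plus" merges labels of
# derivations that produce the same answer tuple. Illustrative only.
SET = (lambda a, b: a and b, lambda a, b: a or b, False)
BAG = (mul, add, 0)
FUZZY = (min, max, 0.0)


def evaluate(head, body, db, system):
    """Evaluate a conjunctive query with a non-empty body.

    head: tuple of variable names returned by the query
    body: list of (relation_name, tuple_of_variable_names) atoms
    db:   dict mapping relation_name -> {tuple: label}
    """
    times, plus, zero = system
    result = {}
    # Pick one stored tuple per body atom, in every possible way.
    choices = [list(db[rel].items()) for rel, _ in body]
    for combo in product(*choices):
        binding, label, consistent = {}, None, True
        for (_, variables), (tup, lab) in zip(body, combo):
            for var, const in zip(variables, tup):
                # A variable must map to the same constant everywhere.
                if binding.setdefault(var, const) != const:
                    consistent = False
                    break
            if not consistent:
                break
            label = lab if label is None else times(label, lab)
        if consistent:
            key = tuple(binding[v] for v in head)
            result[key] = plus(result.get(key, zero), label)
    return result


# Q1(x) :- R(x, y)          Q2(x) :- R(x, y), R(x, z)
q1 = (("x",), [("R", ("x", "y"))])
q2 = (("x",), [("R", ("x", "y")), ("R", ("x", "z"))])

set_db = {"R": {("a", "b"): True, ("a", "c"): True}}
bag_db = {"R": {("a", "b"): 1, ("a", "c"): 1}}
fuzzy_db = {"R": {("a", "b"): 0.5, ("a", "c"): 0.8}}

set_q1, set_q2 = evaluate(*q1, set_db, SET), evaluate(*q2, set_db, SET)
bag_q1, bag_q2 = evaluate(*q1, bag_db, BAG), evaluate(*q2, bag_db, BAG)
fuzzy_q1 = evaluate(*q1, fuzzy_db, FUZZY)
fuzzy_q2 = evaluate(*q2, fuzzy_db, FUZZY)
```

On this one database, `set_q1` equals `set_q2` and `fuzzy_q1` equals `fuzzy_q2`, while `bag_q1` assigns the answer `("a",)` multiplicity 2 and `bag_q2` assigns it multiplicity 4. A single database cannot establish containment or equivalence, but it does suffice to show that Q1 and Q2 are not equivalent when relations are read as multisets.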