Abstract
Effective query optimization is a core feature of any database management system. While most query optimization techniques make use of simple metadata, such as cardinalities and other basic statistics, other techniques rely on more advanced metadata, in particular data dependencies such as functional, uniqueness, order, or inclusion dependencies. This survey provides an overview, intuitive descriptions, and classifications of query optimization and execution strategies that are enabled by data dependencies. We consider the most popular types of data dependencies and focus on strategies that target relational database queries. The survey helps database vendors identify optimization opportunities and helps DBMS researchers find related work and open research questions.
Highlights
Increasing the performance of modern database management systems is a major objective of database research.
To shorten the individual discussions, we assume that, unless stated otherwise, all required data dependencies use the null = null semantics, which is the most commonly required interpretation and the default configuration for most dependency discovery and maintenance algorithms. Note that both null = null and null != null are practical interpretations of null values. While these practical interpretations are very useful for our objective of query optimization, a more accurate interpretation of null values for data dependencies is no information [8]. Under that interpretation, the validity of a dependency depends either on whether some substitution of all null values makes the dependency true, or on whether every substitution of all null values makes the dependency true [71,72].
We present various query optimization techniques that are enabled by the existence of unique column combinations (UCCs). (Primary) keys are by definition UCCs, and, conversely, UCCs serve as key candidates.
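The two practical null interpretations and the role of UCCs can be illustrated with a minimal sketch. It is not taken from the survey: the function names `is_ucc` and `distinct_is_redundant`, the tuple-based row format, and the sample data are assumptions made for illustration only. The first function checks whether a column combination is unique under either null semantics. The second shows one well-known UCC-based rewrite, assuming a projection over a single table and a UCC that holds under null = null semantics: if the projected columns contain a UCC, a `DISTINCT` on that projection cannot remove any row.

```python
from typing import FrozenSet, Iterable, Optional, Sequence, Tuple

# Hypothetical row format: a tuple of values, with None standing in for null.
Row = Tuple[Optional[object], ...]


def is_ucc(rows: Iterable[Row], columns: Sequence[int],
           null_equals_null: bool = True) -> bool:
    """Return True if the projection of rows onto columns has no duplicates."""
    seen = set()
    for row in rows:
        key = tuple(row[c] for c in columns)
        if not null_equals_null and any(value is None for value in key):
            # Under null != null, a key containing a null never equals
            # another key, so it cannot violate uniqueness.
            continue
        if key in seen:
            return False
        seen.add(key)
    return True


def distinct_is_redundant(projected: Iterable[str],
                          uccs: Iterable[FrozenSet[str]]) -> bool:
    """Return True if the projected columns contain at least one known UCC.

    Assumes a single-table projection and UCCs valid under null = null.
    """
    projected_set = frozenset(projected)
    return any(ucc <= projected_set for ucc in uccs)


rows = [(1, None), (1, None), (2, "a")]
print(is_ucc(rows, [0, 1], null_equals_null=True))    # False
print(is_ucc(rows, [0, 1], null_equals_null=False))   # True
print(distinct_is_redundant(["id", "name"], [frozenset({"id"})]))  # True
```

The same column combination is a UCC under null != null but not under null = null, which is why an optimization has to state which semantics it requires before relying on a discovered dependency.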
Summary
Increasing the performance of modern database management systems is a major objective of database research. In this context, research has accelerated the processing of queries through advances in different areas, such as the utilization of new hardware technologies, improved implementations of database operators, and sophisticated query plan modifications as part of the query optimization process. We provide a reference matrix that summarizes the optimizations for different types of data dependencies in different areas of application with regard to the query optimization process. Focus: This survey focuses solely on utilizing data dependencies for effective query optimization and does not consider optimizations that are unrelated to data dependencies.