Abstract

Effective query optimization is a core feature of any database management system. While most query optimization techniques make use of simple metadata, such as cardinalities and other basic statistics, other techniques rely on more advanced metadata, including data dependencies such as functional, uniqueness, order, or inclusion dependencies. This survey provides an overview, intuitive descriptions, and classifications of query optimization and execution strategies that are enabled by data dependencies. We consider the most popular types of data dependencies and focus on strategies that target relational database queries. The survey helps database vendors identify optimization opportunities and helps DBMS researchers find related work and open research questions.

Highlights

  • Increasing the performance of modern database management systems is a major objective of database research

  • To shorten the individual discussions, we assume that, unless stated otherwise, all required data dependencies use the null = null semantics, which is the most commonly required interpretation and the default configuration for most dependency discovery and maintenance algorithms. Note that both null = null and null != null are practical null interpretations. While such a practical interpretation is very useful for our objective of query optimization, a more accurate interpretation of null values for data dependencies is "no information" [8], so that the validity of a dependency depends on whether some substitution of all null values makes the dependency true or whether every substitution of all null values makes the dependency true [71,72]

  • We present various query optimization techniques that are enabled by the existence of unique column combinations (UCCs). (Primary) keys are by definition UCCs, and, conversely, every UCC is a key candidate
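
The two practical null interpretations mentioned above can be illustrated with a minimal sketch of a UCC check. The function name `is_ucc`, its parameters, and the sample rows are hypothetical and chosen for illustration only; they are not taken from the survey. Under null = null, two nulls count as equal values and can therefore violate uniqueness; under null != null, a row with a null in the checked columns never duplicates another row.

```python
from typing import Hashable, Iterable, Optional, Sequence

Row = Sequence[Optional[Hashable]]


def is_ucc(rows: Iterable[Row], columns: Sequence[int],
           null_equals_null: bool = True) -> bool:
    """Return True if the given column indexes form a unique column combination.

    Hypothetical helper for illustration; None stands in for SQL NULL.
    """
    seen = set()
    for row in rows:
        key = tuple(row[c] for c in columns)
        # Under null != null, a key containing a null is distinct from every
        # other key, so it can never cause a uniqueness violation.
        if not null_equals_null and any(value is None for value in key):
            continue
        if key in seen:
            return False  # duplicate projection found
        seen.add(key)
    return True


# Sample relation (assumed data): column 0 is an id, column 1 is nullable.
rows = [(1, "a"), (2, None), (3, None)]

id_is_unique = is_ucc(rows, [0])
nullable_unique_strict = is_ucc(rows, [1], null_equals_null=True)
nullable_unique_sql_like = is_ucc(rows, [1], null_equals_null=False)
```

In this sketch the nullable column is a UCC only under null != null, which is why an optimizer that relies on a UCC has to know which semantics the dependency was validated under.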

Summary

Efficient querying with data dependencies

Increasing the performance of modern database management systems is a major objective of database research. In this context, research has accelerated the processing of queries through advances in different areas, such as the utilization of new hardware technologies, improved implementations of database operators, or sophisticated query plan modifications as part of the query optimization process. We provide a reference matrix, which summarizes the optimizations for different types of data dependencies in different areas of application with regard to the query optimization process. Focus: This survey focuses solely on utilizing data dependencies for effective query optimization and does not consider optimizations that are unrelated to data dependencies.

Query optimization
Data dependencies
Data dependency properties
Classification of dependency-driven query optimization techniques
Unique column combinations
UCCs and joins
UCCs and grouping and aggregation
UCCs and distinctness
UCCs and subqueries
UCCs and set operations
Further optimization opportunities with UCCs
Functional dependencies
FDs and grouping
FDs and joins
FDs and selection
FDs and sorting
Further optimization opportunities
Order dependencies
ODs and sorting
ODs and joins
ODs and grouping
Further optimization opportunities with ODs
INDs and joins
Further optimization opportunities with INDs
Further optimizations
Semantic query optimization
Further dependency types
Summary and outlook