Abstract
SQL has emerged as an industry standard for querying relational database management systems, largely because a user need only specify what data is wanted, not the details of how to access that data. A query optimizer uses a mathematical model of query execution to determine automatically the best way to access and process any given SQL query. This model is heavily dependent upon the optimizer's estimates for the number of rows that will result at each step of the query execution plan (QEP), especially for complex queries involving many predicates and/or operations. These estimates rely upon statistics on the database and modeling assumptions that may or may not be true for a given database. In this paper we discuss an autonomic query optimizer that automatically self-validates its model without requiring any user interaction to repair incorrect statistics or cardinality estimates. By monitoring queries as they execute, the autonomic optimizer compares the optimizer's estimates with actuals at each step in a QEP, and computes adjustments to its estimates that may be used during future optimizations of similar queries. Moreover, estimation errors can also trigger re-optimization of a query in mid-execution. The autonomic refinement of the optimizer's model can result in a reduction of query execution time by orders of magnitude at negligible additional run-time cost. We discuss various research issues and practical considerations that we needed to address during our implementation of a first prototype of LEO, a LEarning Optimizer for DB2® (Database 2™)that learns table access cardinalities and for future queries corrects the estimation error for simple predicates by adjusting the database statistics of DB2.
Published Version
Talk to us
Join us for a 30 min session where you can share your feedback and ask us any queries you have