Efficient execution plan generation is crucial for optimizing database queries. Traditional cost-based optimizers may struggle with complicated queries when exploring large search spaces to identify optimal table join orders. Learning-based optimizers have therefore recently been proposed to leverage past experience and generate high-quality execution plans; however, they generalize poorly to workloads with diverse distributions. In this study, we propose an adaptive plan selector based on reinforcement learning to address these issues. Three challenges must be overcome: (1) How to generate optimal multi-table join orders? We adopt an exploration–exploitation strategy to traverse the vast search space of candidate tables and evaluate the significance of each table. Long short-term memory (LSTM) networks then predict the performance of join orders and generate high-quality candidate plans. (2) How to automatically learn new features from novel datasets? We employ an Actor–Critic strategy that jointly cross-trains the policy and value networks; by adjusting their parameters based on real feedback from the database, new datasets are learned automatically. (3) How to automatically select the best plan? We introduce a constraint-aware optimal plan selection model that captures the relationship between constraints and plans, guiding selection of the best plan under constraints of execution time, cardinality, cost, and mean-squared error (MSE). Experimental results on real datasets demonstrate the superiority of the proposed approach over state-of-the-art baselines: compared with PostgreSQL, total latency is reduced by 29.73% and tail latency by 28.36%.