Abstract
Query optimization remains one of the most challenging problems in data management systems. Recent efforts to apply machine learning techniques to query optimization challenges have been promising, but have shown few practical gains due to substantial training overhead, inability to adapt to changes, and poor tail performance. Motivated by these difficulties and drawing upon a long history of research in multi-armed bandits, we introduce Bao (the BAndit Optimizer). Bao takes advantage of the wisdom built into existing query optimizers by providing per-query optimization hints. Bao combines modern tree convolutional neural networks with Thompson sampling, a decades-old and well-studied reinforcement learning algorithm. As a result, Bao automatically learns from its mistakes and adapts to changes in query workloads, data, and schema. Experimentally, we demonstrate that Bao can quickly (an order of magnitude faster than previous approaches) learn strategies that improve end-to-end query execution performance, including tail latency. In cloud environments, we show that Bao can offer both reduced costs and better performance compared with a sophisticated commercial system.
Highlights
Query optimization is an important task for database management systems
When users issue an EXPLAIN query, three additional pieces of information are added to the output: (1) the expected performance of the generated query plan, (2) the hint set that Bao would recommend if it were in active mode, and (3) the predicted improvement that hint set would provide
Neo [51] showed that deep reinforcement learning could be applied directly to query latency, and could learn optimization strategies that were competitive with commercial systems after 24 hours of training
Summary
This is a desirable feature because reading data from cache is significantly faster than reading it from disk, and the best plan for a query can change depending on what is cached. While integrating such a feature into a traditional cost-based optimizer may require significant engineering and hand-tuning, making Bao cache-aware is as simple as surfacing a description of the cache state. Because it uses only a limited set of hints, Bao has a restricted action space and cannot always learn the best possible query plan. Despite this restriction, in our experiments Bao still significantly outperforms traditional optimizers while training and adjusting to change orders of magnitude faster than "unrestricted" learned query optimizers such as Neo [51]. For the first time, we demonstrate a learned query optimization system that outperforms both open-source and commercial systems in cost and latency, all while adapting to changes in workload, data, and schema.
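The bandit framing above can be illustrated with a deliberately simplified sketch. Bao itself scores candidate plans with a tree convolutional neural network and samples from its posterior via Thompson sampling; the sketch below replaces that model with a Beta-Bernoulli reward model per hint set, and the hint-set names and reward logic are hypothetical, chosen only to show how Thompson sampling balances exploring hint sets against exploiting the one that has worked best so far.

```python
import random

class ThompsonHintSelector:
    """Simplified Thompson sampling over query-hint sets (illustrative only).

    Each hint set ("arm") gets a Beta-Bernoulli model of how often it has
    beaten the optimizer's default plan. Bao's real model is a tree
    convolutional neural network over plan trees, not a scalar win-rate.
    """

    def __init__(self, hint_sets):
        # One (successes, failures) counter pair per hint set,
        # initialized to a uniform Beta(1, 1) prior.
        self.stats = {h: [1, 1] for h in hint_sets}

    def choose(self):
        # Sample a plausible win-rate for each hint set from its
        # posterior, then pick the hint set with the highest sample.
        samples = {h: random.betavariate(a, b)
                   for h, (a, b) in self.stats.items()}
        return max(samples, key=samples.get)

    def update(self, hint_set, beat_default):
        # Record whether the chosen hint set outperformed the default plan.
        counters = self.stats[hint_set]
        if beat_default:
            counters[0] += 1
        else:
            counters[1] += 1

# Hypothetical hint sets; real ones toggle planner features such as
# hash joins, nested-loop joins, or index scans.
selector = ThompsonHintSelector(["default", "no_nested_loop", "no_index_scan"])
for _ in range(100):
    h = selector.choose()
    # In a real system the reward would come from actually executing
    # the query; here one arm is deterministically "good" for brevity.
    selector.update(h, beat_default=(h == "no_nested_loop"))
```

Over repeated queries the selector concentrates on the hint set that keeps winning, while still occasionally re-trying the others, which is what lets a system like this adapt when the workload or data shifts.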