Tidy Tuples and Flying Start: fast compilation and fast execution of relational queries in Umbra

Timo Kersten, Viktor Leis, Thomas Neumann

doi:10.1007/s00778-020-00643-4

Open Access

https://doi.org/10.1007/s00778-020-00643-4

Journal: The VLDB Journal
Publication Date: Jun 2, 2021
Citations: 22
License type: open-access

Affiliation: Friedrich Schiller University Jena

Abstract

Although compiling queries to efficient machine code has become a common approach for query execution, a number of newly created database system projects still refrain from using compilation. It is sometimes claimed that the intricacies of code generation make compilation-based engines too complex. Also, a major barrier for adoption, especially for interactive ad hoc queries, is long compilation time. In this paper, we examine all stages of compiling query execution engines and show how to reduce compilation overhead. We incorporate the lessons learned from a decade of generating code in HyPer into a design that manages complexity and yields high speed. First, we introduce a code generation framework that establishes abstractions to manage complexity, yet generates code in a single fast pass. Second, we present a program representation whose data structures are tuned to support fast code generation and compilation. Third, we introduce a new compiler backend that is optimized for minimal compile time, and simultaneously, yields superior execution performance to competing approaches, e.g., Volcano-style or bytecode interpretation. We implemented these optimizations in our database system Umbra to show that it is possible to unite fast compilation and fast execution. Indeed, Umbra achieves unprecedentedly low query latencies. On small data sets, it is even faster than interpreter engines like DuckDB and PostgreSQL. At the same time, on large data sets, its throughput is on par with the state-of-the-art compiling system HyPer.
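To make the idea of abstractions that still generate code in a single pass more concrete, here is a minimal sketch. It is not Umbra's actual API: the names `IRBuilder`, `Int32`, and `load_column`, the opcode strings, and the column names are all hypothetical. It assumes only that typed wrapper values can overload operators so that writing an expression in the host language appends instructions to a flat buffer, once and in order.

```python
from dataclasses import dataclass, field


@dataclass
class IRBuilder:
    """Hypothetical flat instruction buffer; each instruction is appended once."""
    instructions: list = field(default_factory=list)

    def emit(self, opcode, *operands):
        # The index of an instruction doubles as the name of its result.
        self.instructions.append((opcode, *operands))
        return len(self.instructions) - 1


@dataclass
class Int32:
    """Hypothetical typed wrapper: operators emit IR instead of computing values."""
    builder: IRBuilder
    ref: int

    def __add__(self, other):
        return Int32(self.builder, self.builder.emit("add", self.ref, other.ref))

    def __mul__(self, other):
        return Int32(self.builder, self.builder.emit("mul", self.ref, other.ref))


def load_column(builder, name):
    # Illustrative helper: emits a load and wraps the result in a typed value.
    return Int32(builder, builder.emit("load", name))


builder = IRBuilder()
price = load_column(builder, "price")
quantity = load_column(builder, "quantity")
tax = load_column(builder, "tax")

# Writing the expression in the host language generates the instructions.
total = price * quantity + tax

for index, instruction in enumerate(builder.instructions):
    print(f"%{index} = {instruction}")
```

The point of the sketch is the shape of the design: the generator never builds and re-walks an intermediate tree, so each instruction is written exactly once, while the wrapper type keeps the generating code readable.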

Highlights

- Query compilation is a widely adopted approach for relational database systems [1,7,10,34,46].
- Umbra IR speeds up code generation (Sect. 5.3).
- The Flying Start backend dominates multiple state-of-the-art alternatives (Sect. 5.4).
- The optimizations in the Flying Start backend all provide performance benefits (Sect. 5.5).
- We conclude that Umbra IR speeds up code generation and serves its purpose well, as it effectively reduces Umbra’s query latency.

Summary

Introduction

Query compilation is a widely adopted approach for relational database systems [1,7,10,34,46]. Tidy Tuples uses Umbra IR as the target for the code generator and as the source for all compilation backends. This reduces the time to generate programs and to transform them into executables. Adaptive execution was first introduced in the HyPer query engine. For query execution, it has a choice between intensively optimized code for high-speed execution and two low-latency compilation backends. With the Flying Start backend, we show a solution for the low-latency end of the spectrum, i.e., short-running queries. It generates code even faster than HyPer’s bytecode interpreter, and the resulting execution speed is on par with HyPer’s LLVM-generated code. Together, these three components achieve query latencies for short-running queries that previously were only possible using interpretation. Experimental results show that the triad is so effective at reducing latency that Umbra reaches the latency realms of interpretation-based engines like DuckDB and PostgreSQL, all while keeping the execu-

Tidy Tuples: a low-latency code generation framework

Background: compilation pipeline

Layer overview

From operators to instructions

SQL values

Primitive types for code generation

Host language integration

Control flow

Umbra IR structure

Umbra program representation

Constants and dead-code removal

DBMS-specific instructions

Comparison to LLVM IR

Flying Start backend

Minimal compile-time design

Machine register allocation

To be exact

Result info

Implementation of Flying Start

Evaluation

Experimental setup

Compilation time

Runtime performance robustness

Flying Start optimizations

Implementation effort

Summary

Similar Papers

Copy-and-patch compilation: a fast compilation algorithm for high-level languages and bytecode
Haoran Xu ... Fredrik Kjolstad
Proceedings of the ACM on Programming Languages | VOL. 5
15 Oct 2021

Efficient generation of machine code for query compilers
Henning Funke ... Jan Mühlig
14 Jun 2020

Bringing Compiling Databases to RISC Architectures
Ferdinand Gruber ... Maximilian Bandle
Proceedings of the VLDB Endowment | VOL. 16
01 Feb 2023

Exploiting Repeated Structures and Vectorization in Modelica
Joseph Schuchart ... Ines Gubsch
18 Sep 2015