Distribution-based query scheduling

Yun Chi, Wang-Pin Hsiung, Jeffrey F Naughton, Hakan Hacígümüş

doi:10.14778/2536360.2536367

Abstract

Query scheduling, a fundamental problem in database management systems, has recently received renewed attention, perhaps in part due to the rise of the "database as a service" (DaaS) model for database deployment. While a great deal of work has investigated different scheduling algorithms, comparatively little has investigated what those algorithms can or should know about the queries to be scheduled. In this work, we investigate the efficacy of using histograms that describe the distribution of likely query execution times as input to the query scheduler. We propose a novel distribution-based scheduling algorithm, Shepherd, and show, through extensive experimentation with both synthetic and TPC workloads, that Shepherd substantially outperforms state-of-the-art point-based methods.

Full Text