Abstract

The energy costs of High-Performance Computing are rising significantly. At the same time, increasing demand for information processing has led to cheaper, faster, and larger data management systems. Meeting this demand requires more hardware and software, which in turn puts further pressure on energy costs. In data-centric applications, DBMSs are among the major energy consumers. Faced with this situation, integrating energy into database design becomes an economic necessity. To satisfy this requirement, one of the relevant issues is the development of cost models that estimate energy consumption. While a number of recent papers have explored this problem, most existing work considers energy prediction for a single standalone query. In this paper, we consider the more general problem of multiple concurrently running queries. This is useful for many database management tasks, including admission control, query scheduling, and execution control with energy efficiency as a first-class performance goal. We propose a methodology for defining an energy-consumption cost model that estimates the cost of executing a concurrent workload via statistical regression techniques. We first use the optimizer's cost model to estimate the I/O and CPU requirements of each query pipeline in the workload; we then fit statistical models to the energy observed for these query pipelines; finally, we combine these models to predict the energy consumption of the concurrent workload. To evaluate the quality of our cost model, we conduct experiments using a real DBMS with datasets from the TPC-H and TPC-DS benchmarks. The results obtained show the quality of our cost model.
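
The three steps described above can be illustrated with a minimal sketch. The abstract does not state the regression form or how the per-pipeline models are combined, so the linear model and the additive combination below are assumptions made only for illustration; the function names and all numbers are hypothetical, and the training data is synthetic rather than measured.

```python
import numpy as np

def fit_pipeline_energy_model(io_cost, cpu_cost, energy):
    """Least-squares fit of energy = b0 + b_io * io + b_cpu * cpu.

    The linear form is an assumption; the source only says
    "statistical regression techniques".
    """
    X = np.column_stack([np.ones(len(io_cost)), io_cost, cpu_cost])
    coef, *_ = np.linalg.lstsq(X, np.asarray(energy, dtype=float), rcond=None)
    return coef  # [b0, b_io, b_cpu]

def predict_workload_energy(coef, pipelines):
    """Sum per-pipeline predictions over a workload.

    Plain addition is an assumed way of combining the models.
    """
    return sum(coef[0] + coef[1] * io + coef[2] * cpu for io, cpu in pipelines)

# Synthetic stand-ins for optimizer I/O and CPU estimates per pipeline.
io = np.array([100.0, 250.0, 400.0, 800.0, 1200.0])
cpu = np.array([50.0, 30.0, 200.0, 120.0, 600.0])
# Synthetic "observed" energy generated from made-up coefficients.
energy = 5.0 + 0.02 * io + 0.1 * cpu

coef = fit_pipeline_energy_model(io, cpu, energy)

# Hypothetical concurrent workload made of two pipelines (io, cpu).
workload = [(300.0, 80.0), (900.0, 150.0)]
estimate = predict_workload_energy(coef, workload)
```

In a real setting, the I/O and CPU inputs would come from the DBMS optimizer and the energy values from power measurements; interference between concurrent queries may also require terms that this additive sketch leaves out.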
