Configuration tuning for big-data software systems is generally challenging due to the complex configuration space and costly performance evaluation. In the context of big data, workloads also change over time, making this problem even more difficult. To address the inefficiency caused by retuning from scratch after each workload change, we present ETune, which combines a Bayesian Optimization (BO) based tuner with configuration space reduction techniques to efficiently find high-performance configurations for big-data systems. For configuration space reduction, we develop two approaches: (1) shrinking the configuration space by trading off exploitation and exploration between the impactful parameters and the remaining parameters during tuning, and (2) generating promising regions of the configuration space by transferring knowledge across past tuning tasks. Both methods aim to reduce the huge configuration space to a compact but promising one, and searching in this reduced space further accelerates configuration tuning. Extensive experiments show that our configuration space reduction methods considerably boost the BO-based tuner, and that ETune significantly improves tuning efficiency compared with state-of-the-art transfer learning based approaches.
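To make the general idea concrete, the sketch below shows a minimal BO-style tuning loop restricted to a reduced, promising region of the configuration space. This is not ETune's actual implementation; the parameter names, bounds, and the `measure_performance` stub are hypothetical placeholders for a real benchmark harness and a reduced space produced by the techniques described above.

```python
# Minimal sketch (assumed, not ETune's implementation) of BO-based configuration
# tuning that searches only inside a reduced, promising region of the space.
import numpy as np
from scipy.stats import norm
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern

# Hypothetical reduced region: bounds for two impactful parameters
# (e.g., executor memory in GB, number of shuffle partitions).
REDUCED_BOUNDS = np.array([[4.0, 16.0], [100.0, 400.0]])

def measure_performance(config):
    """Stub: run a benchmark workload under `config` and return its runtime
    (lower is better). Replace with a real performance evaluation."""
    mem, parts = config
    return (mem - 10.0) ** 2 + 0.001 * (parts - 250.0) ** 2 + np.random.normal(0, 0.1)

def sample_in_region(n, bounds, rng):
    """Draw random candidate configurations inside the reduced region."""
    lo, hi = bounds[:, 0], bounds[:, 1]
    return lo + (hi - lo) * rng.random((n, bounds.shape[0]))

def expected_improvement(cands, gp, best_y):
    """Expected-improvement acquisition function for minimization."""
    mu, sigma = gp.predict(cands, return_std=True)
    sigma = np.maximum(sigma, 1e-9)
    z = (best_y - mu) / sigma
    return (best_y - mu) * norm.cdf(z) + sigma * norm.pdf(z)

def tune(n_init=5, n_iter=20, seed=0):
    rng = np.random.default_rng(seed)
    # Initial random measurements inside the reduced space.
    X = sample_in_region(n_init, REDUCED_BOUNDS, rng)
    y = np.array([measure_performance(x) for x in X])
    gp = GaussianProcessRegressor(kernel=Matern(nu=2.5), normalize_y=True)
    for _ in range(n_iter):
        gp.fit(X, y)
        # Candidate configurations are sampled only from the reduced region,
        # which is what accelerates the search.
        cands = sample_in_region(500, REDUCED_BOUNDS, rng)
        x_next = cands[np.argmax(expected_improvement(cands, gp, y.min()))]
        X = np.vstack([X, x_next])
        y = np.append(y, measure_performance(x_next))
    best = np.argmin(y)
    return X[best], y[best]

if __name__ == "__main__":
    config, runtime = tune()
    print("best config:", config, "runtime:", runtime)
```

The key design point illustrated here is that both of ETune's reduction approaches ultimately change only where `sample_in_region` draws candidates from, so the BO loop itself stays unchanged while each performance evaluation is spent inside a smaller, more promising region.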