Recognizing the diversity of Big Data analytic jobs, cloud providers offer a wide range of VM instance types, and even clusters, to cater for different use cases. The choice of cloud configuration can have a significant impact on the response time and running cost of batch-processing applications, which may need to be re-run regularly with cloud-scale resources. However, identifying the best cloud configuration at a low search cost is challenging due to i) the large, high-dimensional configuration space, ii) the time-varying cost of cloud services (e.g., AWS Spot instances), and iii) the variation in job response time even under the same configuration. To tackle these challenges, we design and implement Accordia, a system that enables Adaptive Cloud Configuration Optimization for Recurring Data-Intensive Applications. By leveraging recent algorithmic advances in Gaussian Process UCB techniques, Accordia can find the cost-optimal configuration subject to a deadline constraint (i.e., a maximum tolerated running time) under time-varying cloud service cost. More importantly, Accordia achieves a theoretical performance guarantee: the dynamic regret of the job completion cost grows sub-linearly. 
Using extensive trace-driven simulations and empirical measurements of our Kubernetes-based implementation, we demonstrate that Accordia can identify a near-cost-optimal configuration (i.e., within 10% of the optimum) in fewer than 20 runs out of more than 7000 candidate choices, which translates to a 2X speedup and up to 17.9% cost savings compared with the state-of-the-art approach, CherryPick.
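
To make the search idea concrete, the following is a minimal sketch of a generic Gaussian-process confidence-bound search over a discrete set of configurations. It is not Accordia's algorithm: it ignores the deadline constraint and the time-varying price, and every name in it (`rbf_kernel`, `gp_posterior`, `toy_cost`, `gp_ucb_search`, the exploration weight `beta`, the two-dimensional candidate grid) is a hypothetical choice made for illustration. Because the objective here is a cost to be minimized, the sketch picks the candidate with the lowest lower confidence bound at each run.

```python
import numpy as np

def rbf_kernel(a, b, length_scale=0.3):
    """Squared-exponential kernel between the rows of a and the rows of b."""
    sq_dist = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-0.5 * sq_dist / length_scale**2)

def gp_posterior(x_obs, y_obs, x_cand, noise=1e-2):
    """Posterior mean and std of a zero-mean, unit-variance GP at x_cand."""
    k_oo = rbf_kernel(x_obs, x_obs) + noise * np.eye(len(x_obs))
    k_co = rbf_kernel(x_cand, x_obs)
    # Solve once for both K^{-1} y (column 0) and K^{-1} k_co^T (remaining columns).
    solved = np.linalg.solve(k_oo, np.column_stack([y_obs, k_co.T]))
    mean = k_co @ solved[:, 0]
    var = 1.0 - np.einsum("ij,ji->i", k_co, solved[:, 1:])
    return mean, np.sqrt(np.clip(var, 0.0, None))

def toy_cost(x, rng):
    """Hypothetical noisy job cost with its minimum near x = (0.3, 0.7)."""
    return float(((x - np.array([0.3, 0.7])) ** 2).sum() + 0.01 * rng.normal())

def gp_ucb_search(candidates, cost_fn, n_runs=20, beta=2.0, seed=0):
    """Run n_runs trials, each time picking the lowest lower confidence bound."""
    rng = np.random.default_rng(seed)
    tried = [int(rng.integers(len(candidates)))]  # random first configuration
    costs = [cost_fn(candidates[tried[0]], rng)]
    for _ in range(n_runs - 1):
        y = np.array(costs)
        # Center the observations so the zero-mean prior is a reasonable fit.
        mean, std = gp_posterior(candidates[tried], y - y.mean(), candidates)
        lcb = mean + y.mean() - beta * std  # optimistic estimate of the cost
        nxt = int(np.argmin(lcb))
        tried.append(nxt)
        costs.append(cost_fn(candidates[nxt], rng))
    best = int(np.argmin(costs))
    return candidates[tried[best]], costs[best]

if __name__ == "__main__":
    axis = np.linspace(0.0, 1.0, 21)
    grid = np.array([[a, b] for a in axis for b in axis])  # 441 toy configurations
    config, cost = gp_ucb_search(grid, toy_cost)
    print("best configuration found:", config, "observed cost:", cost)
```

The weight `beta` trades exploration against exploitation: larger values favor configurations whose cost is still uncertain, smaller values favor configurations already predicted to be cheap.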