Abstract

Auto-scaling is a technique that allocates resources according to a dynamic workload. This paper focuses on auto-scaling with heterogeneous container configurations. The goal is to minimize the cost of container adjustments and to reduce the resource insufficiency penalty, while maintaining high resource utilization. It is extremely difficult to achieve the minimal cost without knowing the future workloads in advance. Thus, we first propose a dynamic programming algorithm that scales optimally when given the future workload. This optimal solution is used as the baseline to evaluate other algorithms that do not have the future workload information. Then, we propose two greedy algorithms that do not need workload information in advance, and a heuristic algorithm that first predicts the workload of the next time step using Gradient Boosting Regression, then makes scaling decisions using the optimal dynamic programming algorithm. We evaluate these four algorithms with two realistic workload traces. The experiments show that when the cost to start new servers is much higher than the resource insufficiency penalty, our short-term prediction approach increases the total cost by only 9.6% and decreases the utilization by only 10%, compared with the optimal dynamic programming algorithm, which knows the future workload.
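To make the idea of scaling optimally under a known future workload concrete, the following is a minimal sketch of a dynamic program over container counts. It is not the paper's algorithm: it assumes a single homogeneous container type (the paper handles heterogeneous configurations), and the function name `plan_scaling`, the cost model, and all parameters (`adjust_cost`, `penalty`, `idle_cost`, `capacity`) are hypothetical choices made for illustration.

```python
def plan_scaling(workload, max_containers, capacity=1.0,
                 adjust_cost=5.0, penalty=1.0, idle_cost=0.1, initial=0):
    """Return (min_total_cost, plan) for a fully known workload sequence.

    Assumed cost model (illustrative, not taken from the paper):
      adjust_cost * |n_t - n_{t-1}|        cost of starting/stopping containers
      penalty     * max(0, demand - supply) resource insufficiency penalty
      idle_cost   * max(0, supply - demand) stand-in for low utilization
    """
    inf = float("inf")
    counts = range(max_containers + 1)
    # best[n] = minimal cost of any plan so far that ends with n containers.
    best = [0.0 if n == initial else inf for n in counts]
    back = []  # back[t][n] = predecessor count chosen for state n at step t

    for demand in workload:
        new_best = [inf] * (max_containers + 1)
        choice = [0] * (max_containers + 1)
        for n in counts:
            supply = n * capacity
            step_cost = (penalty * max(0.0, demand - supply)
                         + idle_cost * max(0.0, supply - demand))
            for prev in counts:
                if best[prev] == inf:
                    continue  # unreachable state
                cost = best[prev] + adjust_cost * abs(n - prev) + step_cost
                if cost < new_best[n]:
                    new_best[n] = cost
                    choice[n] = prev
        back.append(choice)
        best = new_best

    # Pick the cheapest final state and trace the decisions backwards.
    n = min(counts, key=lambda k: best[k])
    total = best[n]
    plan = []
    for choice in reversed(back):
        plan.append(n)
        n = choice[n]
    plan.reverse()
    return total, plan
```

Under these assumptions, a short spike is simply absorbed as a penalty when adjustments are expensive: `plan_scaling([0, 3, 0], 3, adjust_cost=5.0, penalty=1.0)` keeps zero containers throughout. This is the regime the abstract refers to, where the start-up cost dominates the insufficiency penalty. A one-step-lookahead heuristic in the spirit of the abstract could call the same routine on a predicted next-step workload instead of the true future.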
