Abstract

In Knative, one of the most popular serverless platforms, dynamic resource allocation is implemented via horizontal auto-scaling algorithms that create or delete service instances based on monitored metrics. However, the resources assigned to each instance are fixed. Vertically scaling the assigned resources per instance up or down is required to avoid over-provisioning resources, which are limited at the edge. Hybrid (horizontal and vertical) auto-scaling solutions proposed by existing works have several limitations. These solutions are optimized for individual services in isolation and degrade in performance when applied in a typical environment running multiple concurrent services. Furthermore, most methods make significant changes to the original Knative platform and have not been adopted upstream. In this article, instead of modifying Knative, we develop separate Kubernetes operators and custom resources (CRs) that assist the Knative auto-scaler with optimal hybrid auto-scaling configurations based on traffic prediction. First, we characterize each service with a profile that pairs different assigned-resource levels with their optimal target Knative horizontal-scaling request concurrency. Then, based on these profiles, we calculate the optimal assigned-resource level, target concurrency level, and number of required instances for each future time step's predicted traffic. Finally, these configurations are applied to Knative's default auto-scaler and the services' CRs. Experiments on our testbed compared our solution with a Knative hybrid auto-scaling solution that does not consider the service's target request concurrency, and with the default Knative horizontal auto-scaler. The results show that our solution improves resource usage by up to 14% and 20%, respectively.
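The configuration-selection step described above can be sketched as follows. This is a minimal illustration only: the profile values, the CPU-millicore units, and the cost model (minimizing total assigned CPU while covering the predicted concurrent load) are assumptions for the sketch, not the paper's exact formulation.

```python
import math

# Hypothetical service profile: each entry pairs an assigned CPU level
# (millicores per instance) with the optimal Knative target request
# concurrency measured for that level. Values are illustrative only.
PROFILE = [
    {"cpu_m": 250, "target_concurrency": 4},
    {"cpu_m": 500, "target_concurrency": 10},
    {"cpu_m": 1000, "target_concurrency": 24},
]

def best_config(predicted_concurrency: float, profile=PROFILE):
    """Pick the (resource level, target concurrency, instance count)
    that covers the predicted concurrent load with the least total CPU."""
    best = None
    for entry in profile:
        instances = math.ceil(predicted_concurrency / entry["target_concurrency"])
        total_cpu = instances * entry["cpu_m"]
        if best is None or total_cpu < best["total_cpu_m"]:
            best = {**entry, "instances": instances, "total_cpu_m": total_cpu}
    return best

# For a predicted load of 30 concurrent requests, the 500m / concurrency-10
# level wins: 3 instances at 1500m total, versus 2000m for the other levels.
cfg = best_config(30)
```

In an operator-based deployment such as the one the abstract describes, the chosen `cpu_m` would be written to the service's CR (driving vertical scaling) and `target_concurrency` and the instance bounds would be passed to the Knative auto-scaler's annotations.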
