In the cloud-native era, container technology, also referred to as OS-level virtualization, is increasingly adopted to deploy cloud applications. Compared with virtual machines, containers are lightweight and flexible in resource management. An important quality-of-service (QoS) class in container management is the burstable container, whose resource limits are higher than its actual requests, allowing the container to expand whenever demand ramps up and additional resources become available. However, efficiently managing burstable containers is challenging, especially for CPU resources. On the one hand, burstable containers should maintain sufficient concurrency, in the form of threads, to utilize extendable CPU resources. On the other hand, the degree of concurrency necessary to utilize peak CPU resources leads to suboptimal performance when a container's CPU allocation is constrained. In this paper, we recommend that the number of threads in a burstable container always be set to its CPU limit to guarantee extensibility. However, modern operating systems (OSes) fall short of efficiently managing the resulting thread oversubscription. First, the OS CPU scheduler is inefficient at scheduling excessive threads and lacks container awareness. Second, the existing blocking synchronization supported by the OS kernel is inefficient in handling the sleep and wakeup of excessive threads. Finally, non-blocking synchronization may waste CPU cycles on busy waiting when more than one thread is in the run queue. To this end, we present a user-level adaptive container scheduler and two OS mechanisms, <italic>virtual blocking</italic> and <italic>busy-waiting detection</italic>, to avoid these inefficiencies in managing burstable containers without requiring program code changes.
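As a concrete point of reference (not drawn from the paper itself), the burstable QoS class described above corresponds to, for example, Kubernetes pods whose CPU requests are lower than their CPU limits; the names and values below are illustrative only:

```yaml
# Illustrative pod spec (hypothetical names/values): because the CPU limit
# (2 cores) exceeds the request (500m) and they are not equal, the pod is
# classified as Burstable and may expand up to 2 CPUs when spare capacity
# is available on the node.
apiVersion: v1
kind: Pod
metadata:
  name: burstable-example
spec:
  containers:
  - name: app
    image: example/app:latest   # placeholder image
    resources:
      requests:
        cpu: "500m"   # guaranteed baseline share
      limits:
        cpu: "2"      # burst ceiling; under the paper's recommendation,
                      # the application would run 2 worker threads
```

Under this reading, the paper's recommendation amounts to sizing the container's thread pool to the limit (here, 2 threads) rather than the request, so the application can absorb extra CPUs whenever they become available.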
Experimental results show that our approaches keep burstable containers efficient while allowing the applications inside them to take advantage of additional CPUs. The performance gain under high system load is up to 29.7×.