Abstract

Multiprocessor operating systems (OSs) pose several unique and conflicting challenges to System Virtual Machines (System VMs). For example, most existing system VMs resort to gang scheduling a guest OS's virtual processors (VCPUs) to avoid OS synchronization overhead. However, gang scheduling is infeasible for some application domains, and is inflexible in other domains. In an overcommitted environment, an individual guest OS has more VCPUs than available physical processors (PCPUs), precluding the use of gang scheduling. In such an environment, we demonstrate a more than two-fold increase in application runtime when transparently virtualizing a chip-multiprocessor's cores. To combat this problem, we propose a hardware technique to detect when a VCPU is wasting CPU cycles, and preempt that VCPU to run a different, more productive VCPU. Our technique can dramatically reduce cycles wasted on OS synchronization, without requiring any semantic information from the software. We then present a server consolidation case study to demonstrate the potential of more flexible scheduling policies enabled by our technique. We propose one such policy that logically partitions the CMP cores between guest VMs. This policy increases throughput by 10-25 percent for consolidated server workloads due to improved cache locality and core utilization.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call