Abstract Server consolidation and virtualization technologies enable multiple end-users to share a single physical server, substantially improving hardware utilization and reducing energy consumption. However, as modern multicore server architectures shift to non-uniform memory access (NUMA), the complex interplay between data access affinity and shared resource overhead continues to pose challenges to consolidation efficiency. In this paper, we first systematically characterize the performance impact of server consolidation on NUMA systems using various cloud applications. We find that both virtual machine (VM) memory access and network I/O access affect cloud application performance. Moreover, as consolidation density continues to grow, conventional approaches fail to manage system load, which degrades overall system performance. Motivated by these two findings, we then propose a load-aware global resource affinity management framework (LG-RAM) that aims to optimize VM consolidation performance on NUMA systems. LG-RAM consists of three components: the VM resource access monitor quantifies VM resource access behavior, the shared resource load detector models the load on shared hardware resources, and the VM resource scheduler makes scheduling decisions based on the information from the other two components. Our evaluations on two different systems indicate that, compared with state-of-the-art approaches, LG-RAM improves average throughput by 41.5% and 54.2% on Intel and AMD NUMA machines, respectively. Additionally, LG-RAM incurs no more than 7% extra CPU usage on average when consolidating 32 VMs.
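The three-component structure described above can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the metrics or the scheduling rule, so the `VMAccess` record, the `affinity` score, the `overload_weight` penalty, and the greedy placement loop are all assumptions chosen only to show how memory affinity, network I/O affinity, and shared resource load might be combined into one decision.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class VMAccess:
    """Hypothetical output of a VM resource access monitor."""
    name: str
    mem_share: Dict[int, float]  # NUMA node -> fraction of memory accesses
    io_node: int                 # node closest to the VM's network device
    io_weight: float             # assumed importance of network I/O, 0..1
    cpu_demand: float            # assumed CPU demand, in cores


def affinity(vm: VMAccess, node: int) -> float:
    """Blend memory and network I/O affinity for one candidate node."""
    mem = vm.mem_share.get(node, 0.0)
    io = 1.0 if node == vm.io_node else 0.0
    return (1.0 - vm.io_weight) * mem + vm.io_weight * io


def schedule(vms: List[VMAccess], capacity: Dict[int, float],
             overload_weight: float = 2.0) -> Dict[str, int]:
    """Greedy placement: maximize affinity minus an assumed overload penalty."""
    load = {node: 0.0 for node in capacity}  # stand-in for a load detector
    placement: Dict[str, int] = {}
    # Place the most demanding VMs first so they get the best nodes.
    for vm in sorted(vms, key=lambda v: v.cpu_demand, reverse=True):
        def score(node: int) -> float:
            projected = (load[node] + vm.cpu_demand) / capacity[node]
            penalty = overload_weight * max(0.0, projected - 1.0)
            return affinity(vm, node) - penalty
        best = max(sorted(capacity), key=score)
        placement[vm.name] = best
        load[best] += vm.cpu_demand
    return placement
```

In this sketch a VM is placed on its preferred node until that node would be oversubscribed, after which the penalty term pushes it to a less-loaded node. That is one simple reading of "load-aware" affinity management; the actual LG-RAM models may differ substantially.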