APPLES: Efficiently Handling Spin-lock Synchronization on Virtualized Platforms

Jianchen Shan, Narain Gehani, Xiaoning Ding

doi:10.1109/tpds.2016.2625249


Open Access

https://doi.org/10.1109/tpds.2016.2625249


Abstract

Spin-locks are widely used in software for efficient synchronization. However, they cause serious performance degradation on virtualized platforms, as seen in the Lock Holder Preemption (LHP) problem and the Lock Waiter Preemption (LWP) problem, due to excessive spinning by virtual CPUs (VCPUs). The excessive spinning occurs when a VCPU waits to acquire a spin-lock. To address the performance degradation, hardware facilities, such as Intel PLE and AMD PF, are provided on processors to preempt VCPUs when they spin excessively. Although these facilities have been predominantly used on mainstream virtualization systems, using them in a manner that achieves the highest performance is still a challenging issue. There are two core problems in using these hardware facilities to reduce excessive spinning. One is to determine the best time to preempt a spinning VCPU (i.e., the selection of spinning thresholds). The other is to decide which VCPU should be scheduled to run after the spinning VCPU is descheduled. Due to the semantic gap between different software layers, the virtual machine monitor (VMM) does not have information about the computation characteristics on VCPUs, which is needed to address the above problems. This makes the problems inherently challenging. We propose a framework named AdaPtive Pause-Loop Exiting and Scheduling (APPLES) to address these problems. APPLES monitors the overhead caused by excessive spinning and by preempting spinning VCPUs, and periodically adjusts spinning thresholds to reduce the overhead. APPLES also evaluates and schedules "ready" VCPUs in a VM by their potential to reduce the spinning incurred by the spin-lock synchronization. The evaluation is based on the causality and the time of VCPU preemptions. The implementation of APPLES incurs only minimal changes to existing systems (about 100 lines of code in KVM). Experiments show that APPLES can improve performance by 3 to 49 percent (14 percent on average) for the workloads with frequent spin-lock operations.
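
The abstract does not give the rule APPLES uses to adjust spinning thresholds. As a rough illustration only, the idea of "monitor the overhead, then periodically adjust the threshold to reduce it" can be sketched as a simple hill-climbing loop. Everything in this sketch is an assumption made for illustration: the class name `ThresholdTuner`, the starting value, the step size, the bounds, and the choice of hill climbing itself are hypothetical and are not taken from the paper or from KVM.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ThresholdTuner:
    """Hypothetical hill-climbing tuner for a spinning threshold (in cycles).

    All default values are illustrative assumptions, not values from the paper.
    """
    threshold: int = 4096       # assumed starting threshold
    step: int = 512             # assumed adjustment per period
    min_threshold: int = 512    # assumed lower bound
    max_threshold: int = 65536  # assumed upper bound
    direction: int = 1          # +1 raises the threshold, -1 lowers it
    last_overhead: Optional[float] = None

    def update(self, spin_cost: float, preempt_cost: float) -> int:
        """Called once per period with the measured costs; returns the new threshold."""
        overhead = spin_cost + preempt_cost
        # If the last adjustment made the total overhead worse, reverse direction.
        if self.last_overhead is not None and overhead > self.last_overhead:
            self.direction = -self.direction
        self.last_overhead = overhead
        # Take one step in the current direction, clamped to the allowed range.
        candidate = self.threshold + self.direction * self.step
        self.threshold = max(self.min_threshold, min(self.max_threshold, candidate))
        return self.threshold


if __name__ == "__main__":
    tuner = ThresholdTuner()
    # Each pair is (time wasted spinning, time wasted on preemptions) for one period.
    for spin, preempt in [(100.0, 50.0), (80.0, 40.0), (90.0, 60.0)]:
        print(tuner.update(spin, preempt))
```

The sketch captures the trade-off the abstract points at: a threshold that is too high wastes cycles on spinning, while one that is too low triggers costly, unnecessary preemptions, so the tuner tracks the sum of both costs rather than either one alone.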


Journal: IEEE Transactions on Parallel and Distributed Systems
Publication Date: Jul 1, 2017
Citations: 6
License type: publisher-specific, author manuscript


Similar Papers

APLE: Addressing Lock Holder Preemption Problem with High Efficiency
Jianchen Shan ... Narain Gehani
01 Nov 2015

Diagnosing Virtualization Overhead for Multi-threaded Computation on Multicore Platforms
Xiaoning Ding ... Jianchen Shan
01 Nov 2015

Mitigating excessive vCPU spinning in VM-agnostic KVM
Kenta Ishiguro ... Naoki Yasuno
07 Apr 2021

A lock-aware virtual machine scheduling scheme for synchronization performance
Chao Yu ... Leihua Qin
The Journal of Supercomputing | VOL. 75
18 Nov 2015
