Scalable queue-based spin locks with timeout

Michael L. Scott, William N. Scherer

doi:10.1145/568014.379566

Abstract

Queue-based spin locks allow programs with busy-wait synchronization to scale to very large multiprocessors, without fear of starvation or performance-destroying contention. So-called try locks, traditionally based on non-scalable test-and-set locks, allow a process to abandon its attempt to acquire a lock after a given amount of time. The process can then pursue an alternative code path, or yield the processor to some other process. We demonstrate that it is possible to obtain both scalability and bounded waiting, using variants of the queue-based locks of Craig, Landin, and Hagersten, and of Mellor-Crummey and Scott. A process that decides to stop waiting for one of these new locks can "link itself out of line" atomically. Single-processor experiments reveal performance penalties of 50-100% for the CLH and MCS try locks in comparison to their standard versions; this marginal cost decreases with larger numbers of processors. We have also compared our queue-based locks to a traditional test-and-test_and_set lock with exponential backoff and timeout. At modest (non-zero) levels of contention, the queued locks sacrifice cache locality for fairness, resulting in a worst-case 3X performance penalty. At high levels of contention, however, they display a 1.5-2X performance advantage, with significantly more regular timings and significantly higher rates of acquisition prior to timeout.
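For readers unfamiliar with the baseline mentioned above, the following is a minimal sketch of a test-and-test_and_set try lock with exponential backoff and timeout. It is not the code from the paper and it is not one of the queue-based CLH or MCS try locks; it only illustrates what "abandoning an acquisition attempt after a given amount of time" can look like. The names `tatas_lock`, `tatas_try_acquire`, `tatas_release`, `now_ns`, and the backoff cap of 1024 iterations are all assumptions chosen for illustration, and the sketch assumes C11 atomics plus a POSIX monotonic clock.

```c
#define _POSIX_C_SOURCE 199309L  /* assumption: POSIX clock_gettime is available */
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical lock type: a single flag shared by all contenders. */
typedef struct { atomic_bool held; } tatas_lock;

/* Current monotonic time in nanoseconds. */
static int64_t now_ns(void) {
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (int64_t)ts.tv_sec * 1000000000LL + ts.tv_nsec;
}

/* Try to acquire the lock for at most patience_ns nanoseconds.
   Returns true on success, false if the caller timed out. */
static bool tatas_try_acquire(tatas_lock *l, int64_t patience_ns) {
    const unsigned max_backoff = 1024;   /* illustrative cap, not tuned */
    unsigned backoff = 1;
    int64_t deadline = now_ns() + patience_ns;

    for (;;) {
        /* "test": a plain read that can be served from the local cache;
           "test_and_set": an atomic exchange, attempted only if the
           lock looked free. */
        if (!atomic_load_explicit(&l->held, memory_order_relaxed) &&
            !atomic_exchange_explicit(&l->held, true, memory_order_acquire))
            return true;

        /* Give up once the caller's patience is exhausted. */
        if (now_ns() >= deadline)
            return false;

        /* Exponential backoff: spin locally, doubling the delay each time. */
        for (volatile unsigned i = 0; i < backoff; i++) { }
        if (backoff < max_backoff)
            backoff *= 2;
    }
}

/* Release the lock; the assertion is only a debugging aid. */
static void tatas_release(tatas_lock *l) {
    assert(atomic_load_explicit(&l->held, memory_order_relaxed));
    atomic_store_explicit(&l->held, false, memory_order_release);
}
```

A caller would typically branch on the result: enter the critical section and call `tatas_release` when `tatas_try_acquire` returns true, and otherwise take the alternative code path or yield the processor. Timing out here is trivial because a waiter holds no state in the lock; in a queue-based lock the waiter occupies a queue node, which is why leaving the queue safely is the harder problem the abstract refers to.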

Full Text