Queue Delegation Locking

David Klaftenegger,Kjell Winblad,Konstantinos Sagonas

doi:10.1109/tpds.2017.2767046

Abstract

The scalability of parallel programs is often bounded by the performance of synchronization mechanisms used to protect critical sections. The performance of these mechanisms is in turn determined by their sequential execution time, efficient use of hardware, and ability to avoid waiting. In this article, we describe queue delegation (QD) locking , a family of locks that both delegate critical sections and enable detaching execution. Threads delegate work to the thread currently holding the lock and are able to detach , i.e., immediately continue their execution until they need a result from a previously delegated critical section. We show how to use queue delegation to build synchronization algorithms with lower overhead and higher throughput than existing algorithms, even when critical sections need to communicate results back immediately. Experiments when using up to 64 threads to access a shared priority queue show that QD locking provides 10 times higher throughput than Pthreads mutex locks and outperforms leading delegation algorithms. Also, when mixing parallel reads with delegated write operations, QD locking outperforms competing algorithms with an advantage ranging from 9.5 up to 207 percent increased throughput. Last but not least, continuing execution instead of waiting for the execution of critical sections leads to increased parallelism and better scalability. As we will see, queue delegation locking uses simple building blocks whose overhead is low even in uncontended use. All these make the technique useful in a wide variety of applications.

Full Text