Scalable Adaptive NUMA-Aware Lock

Mingzhe Zhang,Luwei Cheng,Cho-Li Wang,Francis C M Lau,Haibo Chen

doi:10.1109/tpds.2016.2630695

Abstract

Scalable locking is a key building block for scalable multi-threaded software. Its performance is especially critical in multi-socket, multi-core machines with non-uniform memory access (NUMA). Previous schemes such as in-place locks and delegation locks only perform well under a certain level of contention, and often require non-trivial tuning for a particular configuration. Besides, in large NUMA systems, current delegation locks cannot perform satisfactorily due to lack of optimized NUMA policies. In this work, we propose SANL, a locking scheme that can deliver high performance under various contention levels by adaptively switching between in-place locks and delegation locks. To optimize the performance of delegation locks, we introduce a new NUMA policy that jointly considers node distances and server utilization when choosing lock servers. We have implemented SANL and evaluated it with four popular multi-threaded applications (Memcached, Berkeley DB, Phoenix2 and SPLASH-2), on a 40-core Intel machine and a 64-core AMD machine. The comparison results with seven other representative locking schemes show that SANL outperforms them in most contention situations. For example, in one group test, SANL is 3.7 times faster than RCL lock and 17 times faster than POSIX mutex.

Full Text