Abstract

As more emerging applications move to GPUs, thread-level synchronization has become a requirement. However, GPUs only provide warp-level and thread-block-level rather than thread-level synchronization. Moreover, implementing thread-level synchronization on GPUs with CPU synchronization mechanisms can easily cause livelocks. In this article, we first propose a software-based thread-level synchronization mechanism called lock stealing for GPUs to avoid livelocks. We then describe how to implement our lock stealing algorithm in mutual-exclusion locks and reader-writer locks with high performance. Finally, by putting it all together, we develop a thread-level locking library (TLLL) for commercial GPUs. To evaluate TLLL and show its general applicability, we use it to implement six widely used programs. We compare TLLL against the state-of-the-art ad-hoc GPU synchronization, GPU software transactional memory (STM), and CPU hardware transactional memory (HTM), respectively. The results show that, compared with the ad-hoc GPU synchronization for Delaunay mesh refinement (DMR), TLLL improves performance by 22 percent on average on a GTX970 GPU, and by up to 11 percent on a Volta V100 GPU. Moreover, it significantly reduces the required memory size. Such low memory consumption enables DMR to run successfully on the GTX970 GPU with a 10-million mesh size, and on the V100 GPU with a 40-million mesh size, neither of which the ad-hoc synchronization can handle. In addition, TLLL outperforms the GPU STM by 65 percent, and the CPU HTM (running on a Xeon E5-2620 v4 CPU with 16 hardware threads) by 43 percent on average.
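The livelock the abstract alludes to is a well-known consequence of SIMT lockstep execution: when every thread in a warp spins on a CPU-style test-and-set lock, the warp scheduler may keep replaying the spinning branch while the warp-mate that holds the lock never advances to release it. The sketch below is a minimal CUDA illustration of that hazard and of the commonly used restructuring that keeps acquire, critical section, and release in one branch; it is not TLLL's lock-stealing algorithm, and the kernel, lock variable, and counter are illustrative assumptions.

#include <cuda_runtime.h>

// Hazardous on GPUs: a CPU-style spin lock taken per thread.
// On pre-Volta hardware a warp executes in lockstep, so the threads
// stuck in this loop can starve the warp-mate that holds the lock,
// and the warp as a whole never makes progress.
__device__ void naive_lock(int *lock) {
    while (atomicCAS(lock, 0, 1) != 0) { /* spin forever */ }
}

__device__ void naive_unlock(int *lock) {
    atomicExch(lock, 0);
}

// Common restructuring used by GPU code (illustrative only, not TLLL):
// the critical section and the release stay inside the branch that won
// the CAS, so the holder completes before the warp re-converges.
__global__ void increment_shared_counter(int *lock, volatile int *counter) {
    bool done = false;
    while (!done) {
        if (atomicCAS(lock, 0, 1) == 0) {
            *counter += 1;            // critical section
            __threadfence();          // make the update visible to other SMs
            atomicExch(lock, 0);      // release the lock
            done = true;
        }
    }
}

Even the restructured pattern relies on compiler and scheduler behavior that is not guaranteed on all GPU generations, which is part of the motivation for a dedicated thread-level locking mechanism such as the lock stealing proposed in the paper.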
