Abstract

Heterogeneous devices are becoming necessary components of high performance computing infrastructures, and the graphics processing unit (GPU) plays an important role in this landscape. Given a problem, the established approach for exploiting the GPU is to design solutions that are parallel and free of data dependencies, and to offload them to the GPU's massively parallel hardware. This design principle often leads to applications that cannot maximize GPU hardware utilization. The goal of this article is to challenge this common belief by empirically showing that allowing even simple forms of synchronization enables programmers to design solutions that admit conflicts and achieve better performance. Our experience shows that lock‐based solutions to the k‐means clustering problem, implemented using two well‐known locking strategies, outperform the well‐engineered and parallel KMCUDA on both synthetic and real datasets: averaged across all locking algorithms, runtimes are 8× faster on a synthetic dataset and 1.7× faster on a real‐world dataset (with maximum speedups of 71.3× and 2.75×, respectively). We validate these results with a more sophisticated clustering algorithm, namely fuzzy c‐means, and summarize our findings in three guidelines for making concurrency effective when programming GPU applications.
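To make the idea of a GPU solution that "admits conflicts" concrete, the sketch below shows one way a per-centroid spin lock could guard the accumulation step of a k-means update on the GPU. It is a minimal illustration only, assuming a test-and-set lock built from atomicCAS/atomicExch; the kernel and all names (lockedAdd, sums, counts, locks, and so on) are hypothetical and are not taken from the paper's implementation or from KMCUDA.

// Hypothetical sketch: a per-centroid spin lock protecting the
// centroid accumulation step of a k-means iteration.
__device__ void lockedAdd(float* sums, int* counts, int* locks,
                          int cluster, const float* point, int dims)
{
    bool done = false;
    while (!done) {
        // Try to acquire the per-centroid lock (0 = free, 1 = held).
        if (atomicCAS(&locks[cluster], 0, 1) == 0) {
            // Critical section: fold this point into the centroid's running sum.
            for (int d = 0; d < dims; ++d)
                sums[cluster * dims + d] += point[d];
            counts[cluster] += 1;
            __threadfence();                 // make updates visible before release
            atomicExch(&locks[cluster], 0);  // release the lock
            done = true;
        }
        // If the lock was held, retry; releasing inside the success branch
        // avoids intra-warp livelock on pre-Volta hardware.
    }
}

__global__ void accumulateKernel(const float* points, const int* assignments,
                                 float* sums, int* counts, int* locks,
                                 int numPoints, int dims)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < numPoints)
        lockedAdd(sums, counts, locks, assignments[i], &points[i * dims], dims);
}

Threads assigned to points in the same cluster deliberately contend for the same lock; the claim examined in the article is that such contention can still be cheaper than restructuring the computation to avoid it.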
