LD

Pengcheng Li,Jacob Brock,Eddy Z Zhang,Hao Luo,Xiaoyu Hu,Dong Chen,Chen Ding

doi:10.1145/3046678

Abstract

Data race detection has become an important problem in GPU programming. Previous race-checking tools designed for CPUs are mainly task parallel and incur high overhead on GPUs due to access instrumentation, especially when monitoring the many thousands of threads routinely used by GPU programs. This article presents a novel data-parallel solution designed and optimized for the GPU architecture. It includes compiler support and a set of runtime techniques. It uses value-based checking, which detects the races reported in previous work, finds new races, and supports race-free deterministic GPU execution. More importantly, race checking is massively data parallel and does not introduce divergent branching or atomic synchronization. Its slowdown is less than 5× for over half of the tests and 10× on average, which is orders of magnitude more efficient than the cuda-memcheck tool by Nvidia and the methods that use fine-grained access instrumentation.
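As a rough illustration of what value-based checking can mean, the sketch below compares memory contents after a parallel region instead of instrumenting each access. It is an assumption-laden toy written in Python for readability, not the algorithm from the article: the function name `find_write_write_conflicts`, the per-thread private copies, and the rule "a location changed by more than one copy is a conflict" are all hypothetical choices made for this example.

```python
def find_write_write_conflicts(original, private_copies):
    """Return the indices that more than one private copy changed.

    original:       list of values before the parallel region (hypothetical setup)
    private_copies: one list per thread, holding that thread's view afterwards
    """
    conflicts = []
    for i, old in enumerate(original):
        # Each index is examined independently of every other index, so on a
        # GPU this loop could map to one checker thread per location.
        writers = sum(1 for copy in private_copies if copy[i] != old)
        if writers > 1:
            conflicts.append(i)
    return conflicts


# Two simulated threads start from the same memory image.
original = [0, 0, 0, 0]
copies = [
    [1, 0, 5, 0],  # thread 0 wrote indices 0 and 2
    [0, 2, 7, 0],  # thread 1 wrote indices 1 and 2
]
conflicts = find_write_write_conflicts(original, copies)
```

Because the check only compares values, a write that stores the value already present goes unnoticed in this toy version. The comparison itself needs no per-access bookkeeping and no atomic operations, which is the property the abstract emphasizes.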

Journal: ACM Transactions on Architecture and Code Optimization
Publication Date: Mar 21, 2017
Citations: 13

Similar Papers

Exploring compiler optimization opportunities for the OpenMP 4.x accelerator model on a POWER8+GPU platform
13 Nov 2016

A Multi-Level Platform-Independent GPU API for High-Level Programming Models
Akihiro Hayashi ... Sri Raj Paul
01 Jan 2021

Static detection of uncoalesced accesses in GPU programs
Rajeev Alur ... Nimit Singhania
Formal Methods in System Design | VOL. 60
05 Mar 2021

Massively Parallel, Highly Efficient, but What About the Test Suite Quality? Applying Mutation Testing to GPU Programs
Qianqian Zhu ... Andy Zaidman
06 Aug 2020
