Benchmarking and Evaluating Unified Memory for OpenMP GPU Offloading

Alok Mishra, Hal Finkel, Barbara Chapman, Lingda Li, Martin Kong

doi:10.1145/3148173.3148184

Abstract

The latest OpenMP standard offers automatic device offloading capabilities that facilitate GPU programming. Despite this, many challenges remain. One of these is the unified memory feature introduced in recent GPUs. GPUs in current and future HPC systems have enhanced support for a unified memory space. In such systems, the CPU and GPU can access each other's memory transparently; that is, data movement is managed automatically by the underlying system software and hardware. Memory oversubscription is also possible in these systems. However, there is a significant lack of knowledge about how this mechanism will perform and how programmers should use it. We have modified several benchmark codes in the Rodinia benchmark suite to study the behavior of OpenMP accelerator extensions, and have used them to explore the impact of unified memory in an OpenMP context. We have also modified the open-source LLVM compiler to allow OpenMP programs to exploit unified memory. The results of our evaluation reveal that, while the performance of unified memory is comparable with that of normal GPU offloading for benchmarks with little data reuse, it suffers from significant overhead when GPU memory is oversubscribed for benchmarks with a large amount of data reuse. Based on these results, we provide several guidelines for programmers to achieve better performance with unified memory.
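
To make the comparison in the abstract concrete, the following is a minimal sketch in C of the explicit, map-based style of OpenMP offloading (the "normal GPU offloading" baseline) applied to a loop with data reuse. The function name `scale_on_device` and its parameters are hypothetical and are not taken from the paper or from Rodinia, and the paper's LLVM modifications are not shown. The assumption illustrated here is that the array is transferred once and then reused by several kernels; under unified memory, that movement would instead be left to the system software and hardware.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not from the paper): scales an array `passes` times
   on the target device using explicit OpenMP data mapping. */
void scale_on_device(double *a, size_t n, double factor, int passes)
{
    assert(a != NULL);

    /* Explicit data management: `a` is copied to the device when this
       region starts and copied back when it ends. */
    #pragma omp target data map(tofrom: a[0:n])
    {
        for (int p = 0; p < passes; ++p) {
            /* The array is already present on the device, so this map
               clause reuses the existing copy instead of transferring
               the data again on every pass. */
            #pragma omp target teams distribute parallel for map(tofrom: a[0:n])
            for (size_t i = 0; i < n; ++i) {
                a[i] *= factor;
            }
        }
    }
}
```

Calling `scale_on_device(a, n, 2.0, 3)` multiplies every element by 8. Most compilers ignore the pragmas when OpenMP support is not enabled, in which case the same loops run on the host, which is a convenient way to check results before offloading.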

Similar Papers

Impact of Inter-application Contention in Current and Future HPC Systems
Ana Jokanovic ... German Rodriguez
01 Aug 2010

Compiler assisted hybrid implicit and explicit GPU memory management under unified address space
Lingda Li ... Barbara Chapman
17 Nov 2019

MemHC: An Optimized GPU Memory Management Framework for Accelerating Many-body Correlation
Qihan Wang ... Robert G Edwards
ACM Transactions on Architecture and Code Optimization | VOL. 19
24 Mar 2022

XUnified: A Framework for Guiding Optimal Use of GPU Unified Memory
Hailu Xu ... Murali Emani
IEEE Access | VOL. 10
01 Jan 2021
