Abstract

Edge computing focuses on processing data near its source. Edge computing devices built on the Tegra SoC architecture provide a GPU memory architecture that is physically shared between host and device. To take advantage of this architecture, different modes of memory allocation must be considered: different GPU memory allocation techniques yield different memory usage and execution times for identical applications on Tegra devices. In this article, we implement several GPU application benchmarks, including our custom CFD code, with unified, pinned, and normal host/device memory allocation modes. We evaluate and compare the memory usage and execution time of these workloads on edge computing Tegra systems-on-chip (SoCs) equipped with integrated GPUs using a shared memory architecture, and on non-SoC machines with discrete GPUs equipped with dedicated VRAM. We find that normal memory allocation methods on SoCs use double the required memory because of unnecessary device memory copies, even though device memory is physically shared with host memory. We show that GPU application memory usage can be reduced by up to 50%, and that performance can even improve, simply by replacing normal memory allocation and memory copy methods with managed unified memory or pinned memory allocation.
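As a rough illustration of the three modes compared above, the CUDA sketch below contrasts normal host/device allocation with explicit copies, managed unified allocation, and mapped pinned allocation on a toy kernel. The kernel, buffer size, and launch parameters are illustrative assumptions, not the benchmark code from the article.

```cuda
// Illustrative sketch of the three allocation modes; toy kernel and
// sizes are assumptions, not the article's benchmarks.
#include <cuda_runtime.h>
#include <cstdio>
#include <cstdlib>

__global__ void scale(float *data, float factor, size_t n) {
    size_t i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= factor;
}

int main() {
    // Request host-mapped pinned memory support before any CUDA context
    // is created (a no-op on platforms with unified virtual addressing).
    cudaSetDeviceFlags(cudaDeviceMapHost);

    const size_t n = 1 << 20;
    const size_t bytes = n * sizeof(float);
    const unsigned blocks = (unsigned)((n + 255) / 256);

    // (1) Normal allocation: separate host and device buffers plus
    //     explicit copies. On a Tegra SoC this duplicates the data in
    //     the same physical DRAM.
    float *h = (float *)malloc(bytes);
    for (size_t i = 0; i < n; ++i) h[i] = 1.0f;
    float *d;
    cudaMalloc((void **)&d, bytes);
    cudaMemcpy(d, h, bytes, cudaMemcpyHostToDevice);
    scale<<<blocks, 256>>>(d, 2.0f, n);
    cudaMemcpy(h, d, bytes, cudaMemcpyDeviceToHost);
    cudaFree(d);
    free(h);

    // (2) Managed (unified) allocation: one pointer valid on host and
    //     device; no explicit copies and no duplicated buffer.
    float *m;
    cudaMallocManaged((void **)&m, bytes);
    for (size_t i = 0; i < n; ++i) m[i] = 1.0f;
    scale<<<blocks, 256>>>(m, 2.0f, n);
    cudaDeviceSynchronize();  // finish the kernel before host access
    cudaFree(m);

    // (3) Mapped pinned (page-locked) allocation: on integrated GPUs
    //     the device can access this host buffer directly.
    float *p;
    cudaHostAlloc((void **)&p, bytes, cudaHostAllocMapped);
    for (size_t i = 0; i < n; ++i) p[i] = 1.0f;
    float *p_dev;
    cudaHostGetDevicePointer((void **)&p_dev, p, 0);
    scale<<<blocks, 256>>>(p_dev, 2.0f, n);
    cudaDeviceSynchronize();
    cudaFreeHost(p);

    printf("done\n");
    return 0;
}
```

In the normal mode, the two buffers and the `cudaMemcpy` calls are what double memory usage on a shared-memory SoC; the managed and pinned variants keep a single copy of the data.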
