Chapter 6 - GPU programming with CUDA

Peter S. Pacheco, Matthew Malensek

doi:10.1016/b978-0-12-804605-0.00013-0

Abstract

Graphics processing units (GPUs) are widely used for general-purpose programming. CUDA provides a collection of modifications to the C++ compiler and a library of functions that can be used for general-purpose programming of Nvidia GPUs. GPUs employ features of both SIMD and MIMD processors. GPUs are not ordinarily standalone processors; rather, a typical GPU is paired with a CPU, which carries out basic functions such as I/O, memory allocation, and initialization. We develop a range of CUDA programs, from a basic “hello, world” program to programs for numerical integration and sorting. We illustrate the basics of writing CUDA kernels, which are functions that run on the GPU but are started by the host processor (the CPU). We make use of CUDA global and shared memory, barriers, and warp shuffles to improve the performance of our programs and to address race conditions.
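As a rough illustration of the ideas named in the abstract, the sketch below shows a kernel started by the host that estimates an integral with the trapezoidal rule, sums partial values within each warp using warp shuffles, and uses an atomic add to avoid a race condition on the shared total. It is not taken from the chapter: the integrand `f`, the names `warp_sum` and `trap_kernel`, the interval, and the launch configuration are all assumptions chosen for illustration, and error checking of the CUDA calls is omitted for brevity.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical integrand, chosen only for illustration.
__device__ float f(float x) { return x * x; }

// Sum val across the threads of a warp with warp shuffles.
// After the loop, lane 0 of the warp holds the warp's total.
__device__ float warp_sum(float val) {
    for (int offset = warpSize / 2; offset > 0; offset /= 2)
        val += __shfl_down_sync(0xffffffff, val, offset);
    return val;
}

// Kernel: runs on the GPU, started by the host (the CPU).
// Thread i contributes the i-th term of the trapezoidal rule sum.
__global__ void trap_kernel(float a, float h, int n, float* result) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    float my_val = 0.0f;
    if (i == 0)
        my_val = 0.5f * (f(a) + f(a + n * h));  // the two endpoints
    else if (i < n)
        my_val = f(a + i * h);                  // interior points
    // Every thread calls warp_sum, so the full mask is valid
    // (assumes the block size is a multiple of the warp size).
    my_val = warp_sum(my_val);
    // One thread per warp updates the global total; atomicAdd
    // prevents a race condition between warps.
    if (threadIdx.x % warpSize == 0)
        atomicAdd(result, my_val);
}

int main() {
    const float a = 0.0f, b = 3.0f;  // assumed interval
    const int n = 1024;              // assumed number of trapezoids
    const float h = (b - a) / n;

    // Managed memory is visible to both the host and the device.
    float* result;
    cudaMallocManaged(&result, sizeof(float));
    *result = 0.0f;

    const int threads = 256;  // a multiple of the warp size
    const int blocks = (n + threads - 1) / threads;
    trap_kernel<<<blocks, threads>>>(a, h, n, result);
    cudaDeviceSynchronize();  // wait for the kernel to finish

    printf("Estimate of the integral: %f\n", h * (*result));
    cudaFree(result);
    return 0;
}
```

The sketch can be compiled with `nvcc` on a system with a CUDA-capable GPU. A shared-memory reduction with barriers (`__syncthreads()`) is an alternative way to combine the per-thread values within a block; the warp-shuffle version is used here only because it is shorter.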

Journal: An Introduction to Parallel Programming
Publication Date: Sep 3, 2021
Citations: 4

Similar Papers

Ballooning Graphics Memory Space in Full GPU Virtualization Environments
Younghun Park ... Sungyong Park
Scientific Programming | VOL. 2019
23 Apr 2019

HAccRG: Hardware-Accelerated Data Race Detection in GPUs
Anup Holey ... Vineeth Mekkat
01 Oct 2013

A Metal Pad Discoloration Classification Algorithm Based on Gabor Texture Features Using a GPU
...
Journal of Institute of Control, Robotics and Systems | VOL. 15
01 Aug 2009

Using shared memory as a cache in high performance cellular automata water flow simulations
Topa Pawel ... Mlocek Pawel
Computer Science | VOL. 14
01 Jan 2013
