HPC Workflow on Diverse XPU Architectures with oneAPI

Mandeep Kumar, Gagandeep Kaur

doi:10.1109/conit55038.2022.9848296

Abstract

High Performance Computing (HPC) workloads require a variety of hardware, including scalar, vector, matrix, and spatial architectures. The biggest challenge is on the programming side: writing and deploying code for the Central Processing Unit (CPU) and for accelerators such as the Graphics Processing Unit (GPU) and the Field Programmable Gate Array (FPGA) has traditionally required a variety of languages, libraries, and tools. oneAPI is a unified programming model that addresses this issue and simplifies development across diverse XPU architectures, including CPUs, GPUs, and FPGAs. oneAPI programs are written in Data Parallel C++ (DPC++), which incorporates the SYCL standard for data parallelism and heterogeneous programming along with modern C++ productivity benefits and familiar constructs. DPC++ is a single-source programming language: host code and heterogeneous accelerator kernels coexist in the same source files. This work presents an HPC workflow across diverse XPU architectures, including CPUs, GPUs, and FPGAs, using oneAPI. We look at DPC++ and OpenMP with the Message Passing Interface (MPI) on multiple nodes for CPU and GPU combinations. In addition, we examine the oneAPI Offload Advisor and DPC++ GPU profiling.

Similar Papers

Parallel hyperbolic PDE simulation on clusters: Cell versus GPU
Scott Rostrup ... Hans De Sterck
Computer Physics Communications | VOL. 181
26 Aug 2010

Accelerating Molecular Docking by Parallelized Heterogeneous Computing - A Case Study of Performance, Quality of Results, and Energy-Efficiency using CPUs, GPUs, and FPGAs
30 Nov 2019

FPGA, GPU, and CPU implementations of Jacobi algorithm for eigenanalysis
Mustafa U Torun ... Ali N Akansu
Journal of Parallel and Distributed Computing | VOL. 96
31 May 2016

High performance CCSDS image data compression using GPGPUs for space applications
01 Jan 2015
