Straightforward Heterogeneous Computing with the oneAPI Coexecutor Runtime

Raúl Nozal, Jose Luis Bosque

doi:10.3390/electronics10192386

Abstract

Heterogeneous systems are the core architecture of most computing systems, from high-performance computing nodes to embedded devices, due to their excellent performance and energy efficiency. Efficiently programming these systems has become a major challenge due to the complexity of their architectures and the efforts required to provide them with co-execution capabilities that can fully exploit the applications. There are many proposals to simplify the programming and management of acceleration devices and multi-core CPUs. However, in many cases, portability and ease of use compromise the efficiency of different devices—even more so when co-executing. Intel oneAPI, a new and powerful standards-based unified programming model, built on top of SYCL, addresses these issues. In this paper, oneAPI is provided with co-execution strategies to run the same kernel between different devices, enabling the exploitation of static and dynamic policies. This work evaluates the performance and energy efficiency for a well-known set of regular and irregular HPC benchmarks, using two heterogeneous systems composed of an integrated GPU and CPU. Static and dynamic load balancers are integrated and evaluated, highlighting single and co-execution strategies and the most significant key points of this promising technology. Experimental results show that co-execution is worthwhile when using dynamic algorithms and improves the efficiency even further when using unified shared memory.

Highlights

In recent years, the constant push to improve the performance and energy efficiency of computing systems, together with the growing diversity of architectures and computing devices, has made it possible to tackle a wide variety of problems with heterogeneous systems.
We propose to provide oneAPI with mechanisms that allow co-execution to be implemented without additional effort for the programmer.
The proposed Coexecutor Runtime is built on top of oneAPI as a runtime library that allows the CPU to be exploited in parallel with multiple hardware accelerators and facilitates the implementation of workload balancing algorithms.

Summary

Introduction

The constant push to improve the performance and energy efficiency of computing systems, together with the growing diversity of architectures and computing devices, has made it possible to tackle a wide variety of problems with heterogeneous systems. oneAPI's cross-architecture language, Data Parallel C++ (DPC++) [25], based on the SYCL standard for heterogeneous programming in C++, provides a single, unified, open development model for productive heterogeneous programming with cross-vendor support. It allows code reuse across hardware targets while permitting custom tuning for a specific accelerator. This article addresses a new challenge in improving the usability and exploitation of heterogeneous systems: providing oneAPI with the capacity for co-execution. This is defined as the collaboration of all the devices in the system (including the CPU) to execute a single massively data-parallel kernel [14,26,27,28].
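To make the idea of co-execution concrete, the following is a minimal sketch of a static split of a kernel's index space between devices. It is not the Coexecutor Runtime's API: the names `Device`, `Chunk`, `static_split` and the `relative_speed` weight are hypothetical, and the sketch uses only standard C++ rather than DPC++ so that it stays self-contained. It assumes each device is given a contiguous slice proportional to a weight known in advance.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical description of one device taking part in co-execution.
struct Device {
    std::string name;
    double relative_speed;  // assumed weight, e.g. taken from a profiling run
};

// A contiguous slice [begin, end) of the kernel's index space.
struct Chunk {
    std::size_t begin;
    std::size_t end;
};

// Static policy (illustrative): split `total` work-items once, before execution,
// proportionally to the device weights. The last device takes the remainder so
// that every index is assigned exactly once.
std::vector<Chunk> static_split(std::size_t total, const std::vector<Device>& devices) {
    std::vector<Chunk> chunks;
    if (devices.empty()) return chunks;

    double sum = 0.0;
    for (const auto& d : devices) sum += d.relative_speed;
    assert(sum > 0.0 && "device weights must add up to a positive value");

    std::size_t offset = 0;
    for (std::size_t i = 0; i < devices.size(); ++i) {
        const bool last = (i + 1 == devices.size());
        const std::size_t size = last
            ? total - offset
            : static_cast<std::size_t>(total * (devices[i].relative_speed / sum));
        chunks.push_back({offset, offset + size});
        offset += size;
    }
    return chunks;
}
```

With hypothetical weights of 1.0 for the CPU and 3.0 for the integrated GPU, `static_split(1000, devices)` assigns `[0, 250)` to the CPU and `[250, 1000)` to the GPU. In a real oneAPI program each slice would be submitted as a kernel to that device's queue; a dynamic policy would instead hand out smaller chunks at run time as devices become idle.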

Background

Platform Model

Execution Model

Memory Model

Kernel Programming Model

Motivation

OneAPI Coexecutor Runtime

Static Co-Execution

Dynamic Co-Execution

Load Balancing Algorithms

API Design

Validation

Performance

Scalability

Energy

NBody Benchmark

Related Work

Findings

Conclusions

Journal: Electronics	Publication Date: Sep 29, 2021
Citations: 6	License type: CC BY 4.0
