HePREM: A Predictable Execution Model for GPU-based Heterogeneous SoCs

Bjorn Forsberg,Luca Benini,Andrea Marongiu

doi:10.1109/tc.2020.2980520

Abstract

The ever-increasing need for computational power in embedded devices has led to the adoption heterogeneous SoCs combining a general purpose CPU with a data parallel accelerator. These systems rely on a shared main memory (DRAM), which makes them highly susceptible to memory interference. A promising software technique to counter such effects is the Predictable Execution Model (PREM). PREM ensures robustness to interference by separating programs into a sequence of memory and compute phases, and by enforcing a platform-level schedule where only a single processing subsystem is permitted to execute a memory phase at a time. This article demonstrates for the first time how PREM can be applied to heterogeneous SoCs, based on a synchronization technique for memory isolation between CPU and GPU plus a compiler to transform GPU kernels into PREM-compliant codes. For compute bound GPU workloads sharing the DRAM bandwidth 50/50 with the CPU we guarantee near-zero timing varibility at a performance loss of just 59 percent, which is one to two orders of magnitude smaller than the worst case we see for unmodified programs under memory interference.

Full Text