Autonomic scheduling of tasks from data parallel patterns to CPU/GPU core mixes

T Serban, P Kilpatrick, M Danelutto

doi:10.1109/hpcsim.2013.6641395

Abstract

We propose a methodology for optimizing the execution of data parallel (sub-)tasks on the CPU and GPU cores of the same heterogeneous architecture. The methodology is based on two main components: i) an analytical performance model for scheduling tasks among CPU and GPU cores, such that the global execution time of the overall data parallel pattern is optimized; and ii) an autonomic module which uses the analytical performance model to implement the data parallel computations in a completely autonomic way, requiring no programmer intervention to optimize the computation across CPU and GPU cores. The analytical performance model uses a small set of simple parameters to devise a partitioning, between CPU and GPU cores, of the tasks derived from structured data parallel patterns/algorithmic skeletons. The model takes into account both hardware-related and application-dependent parameters. It computes the percentage of tasks to be executed on CPU and on GPU cores such that both kinds of cores are exploited and performance figures are optimized. The autonomic module, implemented in FastFlow, executes a generic map (reduce) data parallel pattern, scheduling part of the tasks to the GPU and part to the CPU cores so as to achieve optimal execution time. Experimental results on state-of-the-art CPU/GPU architectures assess both the properties of the performance model and the effectiveness of the autonomic module.
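The abstract does not give the model's parameters or formulas, so the following is only an illustrative sketch of how a CPU/GPU task split could be derived from a few measured quantities. It assumes a simple throughput-balancing rule (both device types should finish at the same time), treats the GPU as a single device, and folds any transfer overhead into the per-task GPU time. The names `split_fraction`, `partition`, `completion_time`, `t_cpu`, `t_gpu` and `n_cpu_cores` are hypothetical and are not taken from the paper or from FastFlow.

```python
# Illustrative throughput-balancing split; NOT the model from the paper.
# Assumptions: tasks are independent and equally sized, t_cpu is the time
# one CPU core needs per task, t_gpu is the amortized time the GPU needs
# per task (including data transfer), and the GPU acts as one device.

def split_fraction(t_cpu, t_gpu, n_cpu_cores):
    """Return the fraction of tasks to send to the GPU so that the GPU
    and the pool of CPU cores are expected to finish together."""
    if t_cpu <= 0 or t_gpu <= 0 or n_cpu_cores < 1:
        raise ValueError("times must be positive and n_cpu_cores >= 1")
    cpu_rate = n_cpu_cores / t_cpu   # tasks per time unit, all CPU cores
    gpu_rate = 1.0 / t_gpu           # tasks per time unit, GPU
    return gpu_rate / (cpu_rate + gpu_rate)

def partition(n_tasks, gpu_fraction):
    """Turn a fraction into integer task counts (gpu_tasks, cpu_tasks)."""
    n_gpu = round(n_tasks * gpu_fraction)
    return n_gpu, n_tasks - n_gpu

def completion_time(n_tasks, gpu_fraction, t_cpu, t_gpu, n_cpu_cores):
    """Estimated completion time: the slower of the two device types."""
    n_gpu, n_cpu = partition(n_tasks, gpu_fraction)
    gpu_time = n_gpu * t_gpu
    cpu_time = n_cpu * t_cpu / n_cpu_cores
    return max(gpu_time, cpu_time)

if __name__ == "__main__":
    # Hypothetical measurements, chosen only for illustration.
    frac = split_fraction(t_cpu=8.0, t_gpu=0.5, n_cpu_cores=4)
    print(frac, partition(1000, frac))
    print(completion_time(1000, frac, 8.0, 0.5, 4))
```

Under these assumed numbers the balanced split is estimated to finish sooner than sending every task to the GPU alone or to the CPU cores alone, which is the effect the abstract attributes to exploiting both kinds of cores. An autonomic module in the spirit of the one described could re-measure the per-task times at run time and recompute the fraction.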
