Optimization of Data Assignment for Parallel Processing in a Hybrid Heterogeneous Environment Using Integer Linear Programming

Tomasz Boiński, Paweł Czarnul

doi:10.1093/comjnl/bxaa187

Open Access

Journal: The Computer Journal
Publication Date: Feb 10, 2021
Citations: 2
License type: CC BY-NC 4.0

Affiliation: Gdańsk University of Technology

Abstract

In this paper we investigate a practical approach to applying integer linear programming to optimize the assignment of data to compute units in a multi-level heterogeneous environment with various compute devices, including CPUs, GPUs and Intel Xeon Phis. The model considers an application that processes a large number of data chunks in parallel on various compute units and takes into account computation, communication (including bandwidths and latencies), partitioning, merging, initialization, and the overhead of computational kernel launch and cleanup. We show that theoretical results from our model are close to real results: differences do not exceed 5% for larger data sizes and reach up to 16.7% for smaller data sizes. For an exemplary workload based on solving systems of equations of various sizes with various compute-to-communication ratios, we demonstrate that using an integer linear programming solver (lp_solve) with timeouts yields significantly better total (solver + application) run times than runs without timeouts, and also significantly better than arbitrarily chosen ones. We show that OpenCL 1.2’s device fission makes it possible to obtain better performance in heterogeneous CPU+GPU environments than both the GPU-only configuration and the default CPU+GPU configuration, in which a whole device is assigned to computations, leaving no resources for GPU management.
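
To make the kind of problem described above concrete, the following is a minimal sketch of an integer data-assignment problem: a fixed number of data chunks is split across heterogeneous devices so that the slowest device finishes as early as possible. It is not the paper's model. The device names, the cost figures, the single `startup` term (standing in for initialization, latency and kernel launch) and the `per_chunk` term (standing in for transfer plus computation) are all assumptions made for illustration, and exhaustive search is used here in place of an ILP solver such as lp_solve.

```python
from itertools import product

# Hypothetical per-device costs in seconds. None of these numbers come from the paper.
# "startup" lumps together one-off overheads; "per_chunk" lumps transfer and compute.
DEVICES = {
    "cpu": {"startup": 0.0, "per_chunk": 4.0},
    "gpu": {"startup": 2.0, "per_chunk": 1.0},
    "phi": {"startup": 3.0, "per_chunk": 2.0},
}


def device_time(cost, chunks):
    """Time a device needs for `chunks` chunks; an unused device costs nothing."""
    if chunks == 0:
        return 0.0
    return cost["startup"] + chunks * cost["per_chunk"]


def best_assignment(devices, total_chunks):
    """Enumerate every integer split of the chunks and keep the smallest makespan.

    This brute-force search only works for tiny instances; an ILP solver
    would be used for realistic sizes.
    """
    names = list(devices)
    best = None
    # Choose counts for all devices but the last; the last one takes the remainder.
    for counts in product(range(total_chunks + 1), repeat=len(names) - 1):
        rest = total_chunks - sum(counts)
        if rest < 0:
            continue  # more chunks handed out than exist
        alloc = counts + (rest,)
        makespan = max(device_time(devices[n], c) for n, c in zip(names, alloc))
        if best is None or makespan < best[0]:
            best = (makespan, dict(zip(names, alloc)))
    return best


if __name__ == "__main__":
    makespan, alloc = best_assignment(DEVICES, 10)
    print(makespan, alloc)
```

With these made-up costs, sending all 10 chunks to the `gpu` entry alone would take 12 seconds, while the best mixed split found by the search finishes in 8 seconds. This only illustrates why a heterogeneous assignment can beat a single-device one in such a model; it says nothing about the measurements reported in the paper.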

Full Text