Compiler multiversioning for automatic task granularity control

Peter Thoman, Thomas Fahringer, Herbert Jordan

doi:10.1002/cpe.3302

Abstract

SUMMARY: Task parallelism is a programming technique that has been shown to be applicable in a wide variety of problem domains. A central parameter that needs to be controlled to ensure efficient execution of task parallel programs is the granularity of tasks. When they are too coarse grained, scalability and load balance suffer, while very fine‐grained tasks introduce execution overheads. We present a combined compiler and runtime approach that enables automatic granularity control. Starting from recursive, task parallel programs, our compiler generates multiple versions of each task, increasing granularity by task unrolling. Subsequently, we apply a parallelism‐aware optimizing transformation to remove superfluous task synchronization primitives in all generated versions. A runtime system then selects among these task versions of varying granularity by locally tracking task demand. Benchmarking on a set of task parallel programs using a work‐stealing scheduler demonstrates that our approach is generally effective. For fine‐grained tasks, we can achieve reductions in execution time exceeding a factor of 6, compared with state‐of‐the‐art implementations. Additionally, we evaluate the impact of two crucial algorithmic parameters, the number of generated code versions and the task queue length, on the performance of our method. Copyright © 2014 John Wiley & Sons, Ltd.
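The idea of selecting among task versions of different granularity can be illustrated with a small, single-threaded simulation. This is only a sketch under stated assumptions, not the authors' compiler or runtime: the recursive Fibonacci task, the names `expand`, `select_version`, and `run`, the use of local queue occupancy as a stand-in for "task demand", and the linear mapping from occupancy to version index are all hypothetical choices made for illustration. Version 0 is the finest-grained task body, higher versions inline ("unroll") more recursion levels before spawning, and the last version runs fully sequentially.

```python
from collections import deque


def fib_seq(n):
    """Fully sequential version: no tasks are spawned at all."""
    return n if n < 2 else fib_seq(n - 1) + fib_seq(n - 2)


def expand(n, levels):
    """Inline `levels` recursion levels of fib(n) (hypothetical 'task unrolling').

    Returns (partial_sum, subtasks): base cases reached while inlining are
    summed directly; remaining subproblems are returned as tasks to spawn.
    """
    if n < 2:
        return n, []
    if levels == 0:
        return 0, [n]
    s1, t1 = expand(n - 1, levels - 1)
    s2, t2 = expand(n - 2, levels - 1)
    return s1 + s2, t1 + t2


def select_version(queue_len, num_versions, queue_limit):
    """Assumed policy: a fuller local queue means less demand for new tasks,
    so a coarser version is chosen. Index 0 is finest, the last is sequential."""
    if queue_len >= queue_limit:
        return num_versions - 1
    return min(num_versions - 1, queue_len * num_versions // queue_limit)


def run(n, num_versions=3, queue_limit=8):
    """Simulate one worker. Returns (fib(n), number of tasks executed)."""
    queue = deque([n])
    total = 0
    executed = 0
    while queue:
        task = queue.pop()  # owner takes from its own end, as in work stealing
        version = select_version(len(queue), num_versions, queue_limit)
        executed += 1
        if version == num_versions - 1:
            total += fib_seq(task)
        else:
            partial, subtasks = expand(task, version + 1)
            total += partial
            queue.extend(subtasks)
    return total, executed
```

Calling `run(20)` returns the same Fibonacci value regardless of the parameters, while the second component of the result shows how many tasks were actually created. Setting `queue_limit` very high forces the finest version everywhere, which gives a baseline to compare against. The two parameters `num_versions` and `queue_limit` loosely mirror the two algorithmic parameters the abstract names, although the real system's selection policy may differ.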
