Abstract

Task parallelism is a programming technique that has been shown to be applicable in a wide variety of problem domains. A central parameter that must be controlled to ensure efficient execution of task parallel programs is the granularity of tasks. When tasks are too coarse‐grained, scalability and load balance suffer, while very fine‐grained tasks introduce execution overheads. We present a combined compiler and runtime approach that enables automatic granularity control. Starting from recursive, task parallel programs, our compiler generates multiple versions of each task, increasing granularity by task unrolling. Subsequently, we apply a parallelism‐aware optimizing transformation to remove superfluous task synchronization primitives in all generated versions. A runtime system then selects among these task versions of varying granularity by locally tracking task demand. Benchmarking on a set of task parallel programs using a work‐stealing scheduler demonstrates that our approach is generally effective. For fine‐grained tasks, we achieve reductions in execution time exceeding a factor of 6 compared with state‐of‐the‐art implementations. Additionally, we evaluate the impact of two crucial algorithmic parameters, the number of generated code versions and the task queue length, on the performance of our method. Copyright © 2014 John Wiley & Sons, Ltd.
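To make the idea of selecting among task versions concrete, the following is a minimal single-threaded sketch, not the compiler or runtime described above. It assumes that "task demand" can be approximated by the length of a worker's local task queue, and it uses only two versions of a recursive Fibonacci task: a fine-grained one that spawns two child tasks and a coarse one that runs sequentially. The names `Worker`, `run_fib`, `fib_seq`, and `QUEUE_LIMIT` are hypothetical and chosen only for illustration.

```python
from collections import deque

# Hypothetical threshold: below this local queue length we assume that
# more tasks are in demand and pick the fine-grained version.
QUEUE_LIMIT = 4


def fib_seq(n):
    """Coarsest task version: fully sequential, spawns no tasks."""
    return n if n < 2 else fib_seq(n - 1) + fib_seq(n - 2)


class Worker:
    """Simulates one worker with a local task queue (no real threads)."""

    def __init__(self, queue_limit=QUEUE_LIMIT):
        self.queue = deque()      # pending tasks, stored as their argument n
        self.queue_limit = queue_limit
        self.spawned = 0          # number of tasks placed in the queue
        self.inlined = 0          # number of tasks run in the coarse version

    def run_fib(self, n):
        """Compute fib(n) by draining the local queue, choosing a version per task."""
        total = 0
        self.queue.append(n)
        self.spawned += 1
        while self.queue:
            k = self.queue.pop()  # LIFO, like the owner's end of a work-stealing deque
            if k < 2:
                total += k        # base case contributes its own value
            elif len(self.queue) < self.queue_limit:
                # Few tasks queued: assume demand is high, use the
                # fine-grained version and spawn both children as tasks.
                self.queue.append(k - 1)
                self.queue.append(k - 2)
                self.spawned += 2
            else:
                # Enough tasks queued: use the coarse version and avoid
                # the overhead of creating further tasks.
                total += fib_seq(k)
                self.inlined += 1
        return total
```

Calling `Worker().run_fib(20)` returns the same value as `fib_seq(20)`, while the `spawned` and `inlined` counters show how the threshold shifts work between the two versions. A real system in the spirit of the abstract would generate several unrolled versions per task at compile time and run many such workers under a work-stealing scheduler; this sketch covers only the local selection step.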
