Abstract
Task parallelism is a programming technique that has been shown to be applicable in a wide variety of problem domains. A central parameter that needs to be controlled to ensure efficient execution of task parallel programs is the granularity of tasks. When tasks are too coarse‐grained, scalability and load balance suffer, while very fine‐grained tasks introduce execution overheads. We present a combined compiler and runtime approach that enables automatic granularity control. Starting from recursive, task parallel programs, our compiler generates multiple versions of each task, increasing granularity by task unrolling. Subsequently, we apply a parallelism‐aware optimizing transformation to remove superfluous task synchronization primitives in all generated versions. A runtime system then selects among these task versions of varying granularity by locally tracking task demand. Benchmarking on a set of task parallel programs using a work‐stealing scheduler demonstrates that our approach is generally effective. For fine‐grained tasks, we can achieve reductions in execution time exceeding a factor of 6, compared with state‐of‐the‐art implementations. Additionally, we evaluate the impact of two crucial algorithmic parameters, the number of generated code versions and the task queue length, on the performance of our method. Copyright © 2014 John Wiley & Sons, Ltd.
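The idea of choosing between task versions of different granularity based on local task demand can be illustrated with a small sketch. This is not the compiler or runtime described in the abstract: it is a single-threaded simulation with only two hand-written versions of a recursive Fibonacci task (a fine-grained one that spawns child tasks and a fully sequential one), and the names `run_fib`, `fib_sequential` and `QUEUE_THRESHOLD`, as well as the policy of comparing the local queue length against a fixed threshold, are assumptions made for illustration.

```python
from collections import deque

# Hypothetical parameter: once the local queue holds this many tasks,
# demand for new tasks is treated as low.
QUEUE_THRESHOLD = 4


def fib_sequential(n):
    """Coarsest version: the whole subtree runs inline, no tasks are created."""
    return n if n < 2 else fib_sequential(n - 1) + fib_sequential(n - 2)


def run_fib(n, threshold=QUEUE_THRESHOLD):
    """Evaluate fib(n) with one simulated worker and a local task queue.

    Returns (result, number_of_tasks_created).
    """
    queue = deque([n])   # local task queue of the simulated worker
    total = 0            # fib(n) is the sum of the values at the leaves
    tasks_created = 1    # the root task
    while queue:
        m = queue.pop()  # take the most recently queued task
        if m < 2:
            total += m   # leaf task: fib(0) = 0, fib(1) = 1
        elif len(queue) < threshold:
            # Few tasks queued locally: demand is high, so use the
            # fine-grained version and spawn two child tasks.
            queue.append(m - 1)
            queue.append(m - 2)
            tasks_created += 2
        else:
            # Enough tasks queued: use the coarse version and compute
            # the whole subtree without creating further tasks.
            total += fib_sequential(m)
    return total, tasks_created
```

With a threshold of 0 the sketch never spawns and the root task runs sequentially; with a very large threshold every call becomes a task. Intermediate thresholds create only as many tasks as are needed to keep the queue filled, which is the trade-off between task overhead and available parallelism that the abstract refers to.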
More From: Concurrency and Computation: Practice and Experience