Abstract

The need for parallel task execution has been growing steadily in recent years, since manufacturers now improve processor performance mainly by increasing the number of installed cores rather than the clock frequency. An essential technique for exploiting this potential is loop parallelization. However, a major restriction of available tools for automatic loop parallelization is that loops usually have to be 'polyhedral'; for example, calling functions from within such loops is not allowed. In this paper, we present a seemingly simple extension to the C programming language that marks functions without side effects. Such functions can then essentially be ignored when checking the parallelization opportunities of polyhedral loops. We extended the GCC compiler toolchain accordingly and evaluated several real-world applications, showing that our extension helps to identify additional parallelization opportunities and thus to significantly enhance application performance.

Highlights

  • Processor vendors can no longer cost-effectively improve performance by scaling processor frequencies [1]

  • The performance boost is higher for smaller core counts, while the performance of pure combined with the Intel C/C++ Compiler (ICC) converges to that of the GCC compiler chain for core counts above 16. This automatic vectorization is not carried out when the function is inlined, which is why PluTo and PluTo-SICA cannot benefit from the ICC compiler at smaller core counts

  • The sequential GCC version leads to a runtime of 34.14 s, while the version generated by the Intel ICC compiler requires 31.32 s


Summary

Introduction

Processor vendors can no longer cost-effectively improve performance by scaling processor frequencies [1]. Threads and vector units are typically underused, since programmers need a deep understanding of threading libraries and of parallelism in general to develop efficient parallel applications. To solve this problem, several research projects have developed automatic parallelization tools that transparently transform sequential source code into parallel code. Fortran introduced the keyword pure to mark side-effect-free functions, and the compiler checks that such functions are indeed side-effect-free. This makes it possible to parallelize more code segments automatically. By allowing pure function calls in polyhedral program loops, loops that previously could not be parallelized automatically can be parallelized transparently during compilation. The pure keyword can also be used in libraries to mark functions as side-effect-free, so that even library function calls can appear in automatically parallelized program parts.

Fortran and C Language Extensions
Parallelization Tools
Language Extension
Compiler Pass
Automatic Parallelization
Limitations
Evaluation
Test Applications
Test Environment
Scaling Tests
Matrix–Matrix Multiplication
Heat Distribution
Satellite
Conclusion
