Abstract

The need for parallel task execution has grown steadily in recent years, since manufacturers now improve processor performance mainly by increasing the number of cores instead of scaling the processor's frequency. To make use of this potential, an essential technique for increasing the parallelism of a program is to parallelize its loops. Several automatic loop nest parallelizers, such as PluTo, have been developed in the past. The main restriction of these tools is that the loops must be statically analyzable, which, among other things, disallows function calls within the loops. In this article, we present a seemingly simple extension to the C programming language that marks functions without side effects. The automatic parallelizer can then essentially ignore these functions when it checks whether a loop is parallelizable. We integrated the approach into the GCC compiler toolchain and evaluated it by running several real-world applications. Our experiments show that the C extension helps to identify additional parallelization opportunities and thus to significantly increase application performance.

Highlights

  • Processor vendors can no longer cost-effectively improve performance by scaling processor frequencies [1]

  • The performance boost is higher for smaller core counts, while the performance of pure combined with the Intel C/C++ Compiler (ICC) converges to that of the GCC compiler chain for core counts above 16. The automatic vectorization is not carried out when the function is inlined, which is why PluTo and PluTo-SICA cannot benefit from ICC for smaller core counts

  • The sequential GCC version leads to a runtime of 34.14 s, while the version generated by the Intel ICC compiler requires 31.32 s


Summary

Introduction

Processor vendors can no longer cost-effectively improve performance by scaling processor frequencies [1]. Threads and vector units are typically underused because programmers need a deep understanding of the relevant libraries and of parallelism in general to develop parallel applications efficiently. To solve this problem, several research projects have developed automatic parallelization tools that transparently transform sequential source code into parallel code. Fortran introduced the keyword pure to mark side-effect-free functions and checks that they really are side-effect-free. This makes it possible to parallelize more code segments automatically. By allowing pure function calls in polyhedral program loops, loops that previously could not be parallelized automatically can be parallelized transparently during compilation. The pure keyword can also be used in libraries to mark functions as side-effect-free, so that even library function calls can appear in automatically parallelized program parts.
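To make the idea concrete, the following is a minimal sketch of a loop that calls a side-effect-free function. It does not use the syntax of the paper's C extension, which this summary does not show. Instead it assumes GCC's existing `const` function attribute as a stand-in annotation, and a hand-written OpenMP pragma as a stand-in for the code an automatic parallelizer might generate. The names `weight` and `blend` are hypothetical and chosen only for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Side-effect-free helper: the result depends only on the arguments.
 * Assumption: GCC's existing __attribute__((const)) stands in for the
 * paper's pure marker, whose exact syntax is not given in this summary. */
static __attribute__((const)) double weight(double x, double y)
{
    return 0.25 * x + 0.75 * y;
}

/* Hypothetical kernel: every iteration reads a[i] and b[i], writes only
 * out[i], and calls only a side-effect-free function, so the iterations
 * are independent of each other. */
void blend(int n, const double *a, const double *b, double *out)
{
    assert(a != NULL && b != NULL && out != NULL);

    /* Hand-written stand-in for what an automatic parallelizer could emit.
     * Compilers without OpenMP support enabled ignore this pragma and run
     * the loop sequentially with the same result. */
    #pragma omp parallel for
    for (int i = 0; i < n; ++i) {
        out[i] = weight(a[i], b[i]);
    }
}
```

Without the annotation, a static analysis has to treat the call to `weight` as opaque and assume it may touch shared state, which is the restriction on function calls described above. Compiling with GCC's `-fopenmp` flag enables the pragma.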

Fortran and C Language Extensions
Parallelization Tools
Language Extension
Compiler Pass
Automatic Parallelization
Limitations
Evaluation
Test Applications
Test Environment
Scaling Tests
Matrix–Matrix Multiplication
Heat Distribution
Satellite
Conclusion
