Abstract
This paper describes the design of an automatic parallelization framework. The kernel at its front end is proposed as an instrument for assessing parallel potential, and it was used to measure the maximum achievable speedups for a major subset of the CHStone benchmark suite programs. Within this framework, parallelism is released incrementally. We propose a heuristic-based transformation method, driven by data dependences, that dissociates true dependences. The method generates an internal representation ($IR^{2}$) in which the Banerjee test conditions are met. Two of the three Banerjee test conditions are then satisfied, and on shared-memory many-core/multicore platforms the third can be satisfied by privatization. A safe and suitable (mapping, privatization) pairing can then be chosen among the thread-mapping scenarios that become available in the $IR^{2}$ structure. As a proof of validity, a subset of the CHStone benchmark was instrumented, and the results confirmed that the framework kernel is robust.
Highlights
Demand for parallelizing frameworks is expected to increase considerably in the near future, for two reasons
Parallelization frameworks [9,10,11,12] are source-to-source compilers. They consist mainly of modules that generate an Internal Representation (IR) after parsing, profiling modules, analysis engines, transformation passes, and modules that reverse the parsing step to generate the output sources
The Map is a set of directed graphs that express the program outputs and the whole processing involved for their production
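One way to read a directed-graph view of a program as a measure of parallel potential is the standard work/span bound: total work divided by the length of the longest dependence chain. The sketch below illustrates that general idea only. It is not the paper's Map structure or its measurement method, and the names `speedup_bound`, `cost`, and `deps` are hypothetical.

```python
def speedup_bound(cost, deps):
    """Upper bound on speedup for an acyclic dependence graph.

    cost: dict mapping each node to its execution cost (hypothetical units).
    deps: dict mapping a node to the list of nodes it depends on.
    Assumes the graph is acyclic and small enough for plain recursion.
    """
    finish = {}

    def finish_time(node):
        # Earliest finish time: a node starts once all predecessors are done.
        if node not in finish:
            start = max((finish_time(p) for p in deps.get(node, [])), default=0)
            finish[node] = start + cost[node]
        return finish[node]

    span = max(finish_time(n) for n in cost)  # longest dependence chain
    work = sum(cost.values())                 # sequential execution cost
    return work / span


# Hypothetical example: "b" and "c" both depend on "a", and "out" needs both.
cost = {"a": 1, "b": 2, "c": 2, "out": 1}
deps = {"b": ["a"], "c": ["a"], "out": ["b", "c"]}
bound = speedup_bound(cost, deps)
```

Here the work is 6 and the longest chain (a, then b or c, then out) costs 4, so no schedule can exceed a speedup of 1.5 regardless of the number of threads.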
Summary
Demand for parallelizing frameworks is expected to increase considerably in the near future, for two reasons. Fully automatic parallelization tools are supposed to be reliable instruments for parallel implementations, yet their popularity is not at the desired level. Data dependence profiling has not been addressed exclusively for automatic parallelization purposes; it has been investigated in the wider context of compilation optimizations, such as partial redundancy elimination (PRE), runtime code scheduling [14], and performance tuning [16]. Integrating such profilers into autoparallelization tools seems challenging, since some of them, the dynamic data-dependence profilers [15,16,17], suffer from runtime and memory overhead. In mainstream parallelization techniques, when a pairwise (thread, loop iteration) mapping is adopted, the problem of privatizing data arises. We instrument a subset of the CHStone benchmark suite as a proof of validity of our proposals, and we report the results as approximate estimates of the theoretical speedups.
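The privatization problem mentioned above can be illustrated with a minimal sketch. It assumes a simple block mapping of loop iterations to threads and a single scratch variable. The functions `sequential` and `privatized`, the variable `tmp`, and the choice of block mapping are hypothetical illustrations, not the framework's actual transformation.

```python
from concurrent.futures import ThreadPoolExecutor


def sequential(a, b):
    """Reference loop: one scratch cell `tmp` is reused by every iteration."""
    out = [0] * len(a)
    tmp = 0
    for i in range(len(a)):
        tmp = a[i] + b[i]   # every iteration writes the same cell
        out[i] = tmp * tmp  # and reads it back
    return out


def privatized(a, b, num_threads=4):
    """Same loop with iterations block-mapped to threads and `tmp` privatized."""
    n = len(a)
    out = [0] * n

    def run_chunk(tid):
        # Block mapping (an assumption): thread `tid` owns iterations [lo, hi).
        lo = tid * n // num_threads
        hi = (tid + 1) * n // num_threads
        for i in range(lo, hi):
            tmp = a[i] + b[i]   # private copy: local to this call
            out[i] = tmp * tmp  # each thread writes disjoint indices

    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        list(pool.map(run_chunk, range(num_threads)))
    return out


a = [1, 2, 3, 4, 5]
b = [5, 4, 3, 2, 1]
same = privatized(a, b) == sequential(a, b)
```

If `tmp` were a single shared cell, iterations running on different threads could overwrite each other's value between the write and the read. Giving each thread its own copy removes that conflict, which is why the mapping and the privatization scheme have to be chosen together.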
More From: TURKISH JOURNAL OF ELECTRICAL ENGINEERING & COMPUTER SCIENCES