Abstract
OpenMP, a typical shared-memory programming paradigm, has been applied extensively in the high-performance computing community owing to the popularity of multicore architectures in recent years. The most significant feature of the OpenMP 3.0 specification is the introduction of task constructs, which express parallelism at a much finer granularity. This feature, however, poses new challenges for performance monitoring and analysis. In particular, task creation is separated from task execution, which renders traditional monitoring methods ineffective. This paper presents a mechanism for monitoring task-based OpenMP programs through interposition and proposes two demonstration graphs for performance analysis. The results of two experiments using the BOTS benchmarks are discussed to evaluate the overhead of the monitoring mechanism and to verify the effectiveness of the demonstration graphs.
Highlights
Multicore CPU designs have been widely adopted in current supercomputers, and the performance of these systems depends on the processor frequency and the number of cores.
We propose a new method to monitor and analyze task-based OpenMP applications in tied mode, addressing the question of how performance tools can incorporate this new dimension of the OpenMP paradigm and provide performance analysts with all the data needed to optimize application performance.
Based on the monitoring mechanism and the demonstration graphs described above, we implemented a prototype monitoring library and several demonstration modules, which were integrated into the framework of PAPMAS (Parallel Application Performance Monitoring and Analysis System) [16,17].
Summary
Multicore CPU designs have been widely adopted in current supercomputers, and the performance of these systems depends on the processor frequency and the number of cores. OpenMP, an API regarded as the de facto standard for multithreaded shared-memory programming, is well suited to current multicore architectures. In version 2.5 [1], it provides both directives for constructs such as parallel regions, sections, and single, and API functions for parallelizing regular structures. Irregular and dynamic structures, such as the recursive routines widely used in real programs, are not well supported in version 2.5. The task construct introduced in version 3.0 is represented by two separate entities: one for task creation and the other for task execution. These entities can be separated along two dimensions: time and space, since a task may be executed later than it is created and by a different thread.
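The separation between task creation and task execution can be illustrated with a small recursive routine. The following is only a minimal sketch, not the monitoring library described in the paper: the names `fib`, `run_fib`, `tasks_created`, and `tasks_executed` are hypothetical, and the two counters merely stand in for the creation and execution events that a monitoring tool would need to record separately.

```c
/* Hypothetical event counters, for illustration only. */
long tasks_created = 0;   /* incremented where a task is created  */
long tasks_executed = 0;  /* incremented where a task body starts */

static long fib(int n)
{
    long a, b;

    if (n < 2)
        return n;

    /* Creation event: recorded by the thread that encounters the construct. */
    #pragma omp atomic
    tasks_created++;
    #pragma omp task shared(a) firstprivate(n)
    {
        /* Execution event: may happen later and on another thread. */
        #pragma omp atomic
        tasks_executed++;
        a = fib(n - 1);
    }

    #pragma omp atomic
    tasks_created++;
    #pragma omp task shared(b) firstprivate(n)
    {
        #pragma omp atomic
        tasks_executed++;
        b = fib(n - 2);
    }

    /* Wait for both child tasks before combining their results. */
    #pragma omp taskwait
    return a + b;
}

/* Runs fib(n) with one thread creating the root tasks inside a parallel region. */
long run_fib(int n)
{
    long result = 0;

    tasks_created = 0;
    tasks_executed = 0;

    #pragma omp parallel
    {
        #pragma omp single
        result = fib(n);
    }
    return result;
}
```

When compiled with an OpenMP-enabled compiler (for example, with `-fopenmp` under GCC), the two counters are updated at different points in time and possibly by different threads, yet they agree once `run_fib` returns. Without OpenMP support the pragmas are ignored and the routine runs serially.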