Abstract

At the Petascale level of performance, High-Performance Computing (HPC) systems require significant use of supercomputers with the extensive parallel programming approaches to solve the complicated computational tasks. The Exascale level of performance having 1018 calculations per second is another remarkable achievement in computing with a fathomless influence on everyday life. The current technologies are facing various challenges while achieving ExaFlop performance through energy-efficient systems. Massive parallelism and power consumption are vital challenges for achieving ExaFlop performance. In this paper, we have introduced a novel parallel programming model that provides massive performance under power consumption limitations by parallelizing data on the heterogeneous system to provide coarse grain and fine-grain parallelism. The proposed dual-hierarchical architecture is a hybrid of MVAPICH2 and CUDA, called the M2C model, for heterogeneous systems that utilize both CPU and GPU devices for providing massive parallelism. To validate the objectives of the current study, the proposed model has been implemented using bench-marking applications including linear Dense Matrix Multiplication. Furthermore, we conducted a comparative analysis of the proposed model by existing state-of-the-art models and libraries such as MOC, kBLAS, and cuBLAS. The suggested model outperforms existing models while achieving massive performance in HPC clusters and can be considered for emerging Exascale computing systems.

Highlights

  • The High-Performance Computing (HPC) can practice computing power and parallel processing techniques to resolve extensive enigmas in various disciplines, e.g., medicine, engineering, commercial enterprise, smart cities [51] and so on, by providing much higher performance than that of a traditional desktop computer [1, 2]

  • An HPC system usually consists of a community of CPUs where each processor contains multi-cores along with its local memory to execute a variety of complicated tasks [38]

  • We have proposed a novel hybrid parallel programming for parallel computing, which is called as MVAPICH2+ Compute Unified Device Architecture (CUDA) (M2C) model

Read more

Summary

Introduction

The High-Performance Computing (HPC) can practice computing power and parallel processing techniques to resolve extensive enigmas in various disciplines, e.g., medicine, engineering, commercial enterprise, smart cities [51] and so on, by providing much higher performance than that of a traditional desktop computer [1, 2]. An HPC system usually consists of a community of CPUs where each processor contains multi-cores along with its local memory to execute a variety of complicated tasks [38]. HPC systems utilize supercomputers and parallel processing techniques to execute extensive jobs. Large problems are broken down into smaller ones that could be resolved at the same time to enhance the overall performance of HPC systems by parallel computing. If a traditional desktop computer takes 200 hours to complete a specific task while it could be completed in 1 hour by utilizing 200 computers at once by the HPC system. A single desktop computer might not be as useful as it could be while utilizing all the resources collectively as a community

Objectives
Results
Conclusion
Full Text
Paper version not known

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.