Abstract
This chapter presents a solution for the design and implementation of data‐parallel scientific applications on highly heterogeneous and hierarchical high‐performance computing (HPC) platforms, based on functional performance models (FPMs) of computing devices and nodes. It reviews related work and concludes that data partitioning algorithms based on FPMs are better suited to balancing data‐parallel scientific applications on heterogeneous platforms. The main contribution of this work is the adaptation of FPM‐based data partitioning to hybrid CPU/graphics processing unit (GPU) nodes and clusters. The chapter demonstrates how to design a scientific application to make use of FPM‐based data partitioning on a heterogeneous hierarchical platform, using the well‐known parallel matrix multiplication application as the example. It also provides an efficient method that builds the models to a sufficient level of accuracy in the relevant range of problem sizes (partial FPMs). Partial FPMs were originally designed for heterogeneous uniprocessors.