Abstract

Heterogeneous cluster systems consisting of CPUs and different kinds of accelerators have become mainstream in HPC. Programming such systems is difficult: it requires addressing manifold challenges that stem from the intricate composition of these systems and the peculiarities of scientific applications, and a broad range of obstacles to efficient execution must be considered and handled properly. In this paper, we propose a systematic approach and a framework that provide comprehensive support for running data-parallel applications on heterogeneous asymmetric clusters. Our implementation partitions and distributes work so that the workload is balanced across the cluster, while handling partitioning-induced communication and synchronization transparently. We evaluate our approach on 11 representative scientific applications from different domains. Experimental results show strong speedup and workload balance across different cluster configurations.
