Abstract

In the context of HPC platforms, individual nodes nowadays consist of heterogeneous processing resources such as GPUs and multicore CPUs. These resources share communication and storage resources, which induces complex co-scheduling effects and makes it hard to predict the exact duration of a task or a communication. To cope with these issues, dynamic runtime schedulers such as StarPU have been developed. These systems base their decisions at runtime on the state of the platform and possibly on static task priorities computed offline. In this paper, our goal is to quantify performance variability on heterogeneous HPC nodes, focusing on very regular dense linear algebra kernels such as Cholesky and LU factorizations. We first evaluate the variability of the individual block-sized kernels. We then analyze the impact of this variability at the scale of a full application running on a dynamic runtime scheduler such as StarPU. The aim is to determine whether the strategies designed to cope with stragglers in MapReduce applications could be transferred to HPC systems, or whether the dynamic nature of runtime schedulers is enough to cope with actual performance variations, even in the presence of task dependencies.

