Abstract

Scientific applications running on heterogeneous computing systems, which often behave unpredictably, improve their performance by employing loop scheduling techniques that avoid load imbalance through an optimized assignment of their parallel loop iterations. With current computing platforms delivering petascale performance and promising exascale performance towards the end of the present decade, efficient and robust algorithms are required to guarantee optimal performance of parallel applications in the presence of unpredictable perturbations. A number of dynamic loop scheduling (DLS) methods based on probabilistic analyses have been developed to achieve the desired robust performance. In earlier work, two metrics (flexibility and resilience) were formulated to quantify the robustness of various DLS methods in heterogeneous computing systems with uncertainties. In this work, to ensure robust performance of scientific applications on current (petascale) and future (exascale) high performance computing systems, a simulation model was designed and integrated into the SimGrid simulation toolkit. The model enables a comprehensive study of the robustness of the DLS methods, based on experimental cases that combine various numbers of processors, problem sizes, and scheduling methods. The DLS methods were implemented in the simulation model and analyzed to explore their flexibility (robustness against unpredictable variations in the system load) across a range of scenarios comprising various distributions that characterize loop iteration execution times and system availability. The reported simulation results are used to compare the robustness of the DLS methods under the environments considered, using the flexibility metric.

