Resource scheduling plays a vital role in High-Performance Computing (HPC) systems. Most scheduling research in HPC has focused on only a single type of resource (e.g., computing cores or I/O resource). With the advancement in hardware architectures and the increase in data-intensive HPC applications, there is a need to simultaneously consider a diverse set of resources (e.g., computing cores, cache, memory, I/O, and network resources) in the design of runtime schedulers for improving the overall application performance. In this paper, we study multi-resource scheduling to minimize the makespan of computational workflows comprised of moldable parallel jobs. Moldable jobs allow the scheduler to flexibly select a variable set of resources before execution, thus can adapt to the available system resources (as compared to rigid jobs) while staying easy to design and implement (as compared to malleable jobs). We propose a Multi-Resource Scheduling Algorithm (MRSA), which combines a novel resource allocation strategy and an extended list scheduling scheme to schedule the jobs. We prove that, on a system with d types of schedulable resources, MRSA achieves an approximation ratio of 1.619d+2.545d+1 for any d≥1, and a ratio of d+3d23+O(d3) when d is large (i.e., d≥22). We also present improved approximation results for workflows comprised of jobs with special precedence constraints (e.g., series-parallel graphs, trees, and independent jobs). Further, we prove a lower bound of d on the approximation ratio of any list-based scheduling algorithm with local priority considerations. Finally, we conduct a comprehensive set of simulations to evaluate the performance of the algorithm using synthetic workflows of different structures and moldable jobs following different speedup models. The results show that MRSA fares better than the theoretical bound predicts, and that it consistently outperforms two baseline heuristics under a variety of parameter settings, illustrating its robust practical performance.
Read full abstract