Towards Optimality in Parallel Job Scheduling

Benjamin Berg, Mor Harchol-Balter, Jan-Pieter Dorsman

doi:10.1145/3292040.3219666

Abstract

To keep pace with Moore's law, chip designers have focused on increasing the number of cores per chip. To effectively leverage these multi-core chips, one must decide how many cores to assign to each job. Given that jobs receive sublinear speedups from additional cores, there is a tradeoff: allocating more cores to an individual job reduces the job's runtime, but decreases the efficiency of the overall system. We ask how the system should assign cores to jobs so as to minimize the mean response time over a stream of incoming jobs. To answer this question, we develop an analytical model of jobs running on a multi-core machine. We prove that EQUI, a policy which continuously divides cores evenly across jobs, is optimal when all jobs follow a single speedup curve and have exponentially distributed sizes. We also consider a class of "fixed-width" policies, which choose a single level of parallelization, k, to use for all jobs. We prove that, surprisingly, fixed-width policies which use the optimal fixed level of parallelization, k*, become near-optimal as the number of cores becomes large. In the case where jobs may follow different speedup curves, finding a good scheduling policy is even more challenging. In particular, EQUI is no longer optimal, but a very simple policy, GREEDY*, performs well empirically.
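The two allocation rules described above can be illustrated with a minimal sketch. It assumes a hypothetical speedup curve s(k) = k^p with 0 < p < 1, which the abstract does not specify. The function names and the choice to leave excess jobs waiting with zero cores under a fixed-width policy are also illustrative assumptions, not the authors' model.

```python
def speedup(cores: float, p: float = 0.5) -> float:
    """Hypothetical sublinear speedup curve s(k) = k**p (assumed form)."""
    return cores ** p


def equi_allocation(num_jobs: int, num_cores: int) -> list[float]:
    """EQUI: divide all cores evenly across the jobs currently in the system."""
    if num_jobs == 0:
        return []
    return [num_cores / num_jobs] * num_jobs


def fixed_width_allocation(num_jobs: int, num_cores: int, k: int) -> list[float]:
    """Fixed-width: each running job gets exactly k cores.

    At most num_cores // k jobs run at once; in this sketch the remaining
    jobs wait with zero cores (an assumption about queueing).
    """
    running = min(num_jobs, num_cores // k)
    return [float(k)] * running + [0.0] * (num_jobs - running)


def total_service_rate(allocation: list[float], p: float = 0.5) -> float:
    """Sum of per-job speedups, i.e. total work completed per unit time."""
    return sum(speedup(c, p) for c in allocation if c > 0)


if __name__ == "__main__":
    cores = 16
    for jobs in (1, 4, 8):
        equi = total_service_rate(equi_allocation(jobs, cores))
        fixed = total_service_rate(fixed_width_allocation(jobs, cores, k=4))
        print(f"jobs={jobs}: EQUI rate={equi:.3f}, fixed-width(k=4) rate={fixed:.3f}")
```

With a single job in the system, EQUI gives it all 16 cores while the fixed-width policy leaves 12 cores idle. Once the number of jobs reaches 16 / k, the fixed-width policy uses every core. This shows the tradeoff the abstract describes between per-job runtime and overall system efficiency.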
