Abstract

We consider a natural scheduling problem that arises in many distributed computing frameworks. Jobs with diverse resource demands (e.g., memory requirements) arrive over time and must be served by a cluster of servers. To improve throughput and delay, the scheduler can pack as many jobs as possible into each server; however, the sum of the jobs' resource demands cannot exceed the server's capacity. Motivated by the increasing complexity of workloads in shared clusters, we consider a setting where jobs' resource demands belong to a very large set of diverse types or, in the extreme case, infinitely many types, i.e., resource demands are drawn from a general unknown distribution over a possibly continuous support. Classical scheduling approaches that crucially rely on a predefined finite set of types are ill-suited to this high (or infinite) type setting. We first characterize a fundamental limit on the maximum throughput in such a setting. We then develop oblivious scheduling algorithms, based on Best-Fit and Universal Partitioning, that have low complexity and achieve at least 1/2 and 2/3 of the maximum throughput, respectively, without knowledge of the resource demand distribution. Extensive simulation results, using both synthetic and real traffic traces, are presented to verify the performance of our algorithms.
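
To make the packing constraint concrete, the following minimal Python sketch illustrates the classical Best-Fit placement rule from bin packing: an arriving job goes to the feasible server with the least remaining capacity. It is an illustration under stated assumptions only, not the scheduling algorithm analyzed in the paper, whose details (queueing, departures, throughput guarantees) are not given in this abstract. The names `best_fit_place` and `residual` are hypothetical, and a single resource with normalized server capacity is assumed.

```python
from typing import List, Optional


def best_fit_place(demand: float, residual: List[float]) -> Optional[int]:
    """Place one job using the classical Best-Fit rule (illustrative sketch).

    `residual[i]` is the unused capacity of server i (hypothetical model:
    one resource, no departures). Returns the chosen server index, or None
    if the job fits in no server. Updates `residual` in place.
    """
    best: Optional[int] = None
    for i, free in enumerate(residual):
        # Feasibility: total demand on a server must not exceed its capacity.
        if demand <= free and (best is None or free < residual[best]):
            best = i  # tightest feasible server seen so far
    if best is not None:
        residual[best] -= demand
    return best


if __name__ == "__main__":
    residual = [1.0, 0.5, 0.25]
    for d in (0.5, 0.25, 0.75, 0.5):
        print(d, "->", best_fit_place(d, residual))
```

In this toy run, the last job of size 0.5 is rejected because no server has enough remaining capacity; a real scheduler would presumably queue it until capacity frees up.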
