On Effective Scheduling in Computing Clusters

D A Grushin,N N Kuzyurin

doi:10.1134/s0361768819070077

Abstract

Presently, big companies such as Amazon, Google, Facebook, Microsoft, and Yahoo! own huge datacenters with thousands of nodes. These clusters are used simultaneously by many clients. The users submit jobs containing one or more tasks. The task flow is usually a mix of short, long, interactive, and batch tasks with different priorities. The cluster scheduler decides on which server the task should be run as a process, container, or virtual machine. Scheduler optimizations are important as they provide higher server utilization, lower latency, improved load balancing, and fault tolerance. Optimal task placement is a complex problem that has multiple dimensions and requires algorithmically complex optimizations. This increases placement latency and limits cluster scalability. In this paper, we consider different cluster scheduler architectures and optimization problems.

Full Text