Scheduling Executors with Time-varying Resource Demands on Data-Parallel Computation Frameworks

Hu Yang, Su Lin

doi:10.12783/dtssehs/asshm2016/8355

Abstract

Efficiently scheduling execution instances of data-parallel computing frameworks, such as Spark and Dryad, in a multi-tenant environment is critical to application performance and system utilization. To this end, a scheduler has to avoid resource fragmentation and over-allocation so that both idleness and contention of resources are minimized. To make effective decisions, it has to know and exploit the resource demands of individual execution instances, including both short-lived tasks and long-lived executors. This becomes particularly challenging when resource demands vary greatly over time within each instance. Prior studies often assume that a scheduled instance is either short-lived or has gradually varying resource demands. However, as in-memory computing platforms such as Spark become increasingly popular, this assumption no longer holds. The unit of scheduling becomes the executor, which runs an entire application once it is scheduled. An executor is usually not short-lived, and its resource demands vary significantly over time. To address the inefficacy of current cluster schedulers, we propose a scheduling approach, named Prophet, which takes the variation of resource demands within each executor into account in its scheduling decisions. It leverages the fact that the execution of a data-parallel application is pre-defined by a DAG structure and that resource demands at the various DAG stages are highly predictable. With this knowledge, Prophet schedules executors to minimize resource fragmentation and over-allocation. To deal with unexpected resource contention, Prophet adaptively backs off selected task(s) to reduce the contention. We have implemented Prophet in Apache Yarn running Spark. We evaluated it on a 16-server cluster, using a total of 90 application benchmarks in 10 categories. Compared to Yarn’s default capacity and fair schedulers, Prophet reduces application makespan by up to 39% and median completion time by 23%.
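To illustrate why per-stage demand profiles matter, the following is a minimal sketch and not Prophet's actual algorithm. It assumes a single resource, discrete time slots, and a known demand profile per executor (for example, one value per DAG stage). The names `Server`, `placement_cost`, and `place` are hypothetical. The sketch greedily picks the server on which the new executor causes the least over-allocation, breaking ties by the least idle capacity.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Server:
    """Hypothetical server model: one resource, committed load per time slot."""
    capacity: float
    load: List[float] = field(default_factory=list)

    def load_at(self, t: int) -> float:
        # Slots beyond the committed horizon carry no load yet.
        return self.load[t] if t < len(self.load) else 0.0


def placement_cost(server: Server, profile: List[float]) -> Tuple[float, float]:
    """Return (over-allocation, idle capacity) summed over the profile's slots."""
    over = 0.0
    idle = 0.0
    for t, demand in enumerate(profile):
        total = server.load_at(t) + demand
        over += max(0.0, total - server.capacity)
        idle += max(0.0, server.capacity - total)
    return over, idle


def place(servers: List[Server], profile: List[float]) -> int:
    """Greedily place an executor; returns the index of the chosen server."""
    # Tuples compare lexicographically: over-allocation first, then idleness.
    best = min(range(len(servers)),
               key=lambda i: placement_cost(servers[i], profile))
    chosen = servers[best]
    if len(chosen.load) < len(profile):
        chosen.load.extend([0.0] * (len(profile) - len(chosen.load)))
    for t, demand in enumerate(profile):
        chosen.load[t] += demand
    return best


if __name__ == "__main__":
    # Two servers whose existing executors peak at different times.
    servers = [Server(10.0, [8.0, 2.0, 2.0]), Server(10.0, [2.0, 8.0, 2.0])]
    # The new executor peaks in slot 1, so it interleaves with server 0.
    print(place(servers, [2.0, 8.0, 2.0]))
```

In this example a scheduler that reserves each executor's peak demand would find no server with room, since both servers already host a peak of 8 out of 10 and the new executor also peaks at 8. Using the time-varying profile, the executor fits on server 0 without any over-allocation because the two peaks fall in different slots.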
