Abstract

In data centers, the coflow abstraction was proposed to better express the requirements and communication semantics of a group of parallel flows generated by the jobs of cluster computing frameworks. By exploiting coflow-level information such as coflow size, previous coflow scheduling proposals improve performance over flow-level scheduling schemes. Because some coflow information is difficult to obtain in cloud environments, designing coflow scheduling mechanisms that work with partial or no prior information has recently attracted much attention. However, existing information-agnostic mechanisms are generally built on the least-attained-service heuristic, which schedules coflows only according to the bytes each coflow has sent, and they ignore other useful coflow-level information such as width, length, and communication patterns. In this paper, we show that the coflow completion time can be further decreased by jointly leveraging multiple coflow-level attributes. Based on this finding, we present a Multiple-attributes-based Coflow Scheduling (MCS) mechanism to reduce the coflow completion time. In MCS, at the start of a coflow, a shortest-and-narrowest-coflow-first algorithm assigns the initial priority based on the coflow width. During the transmission of coflows, a double-threshold scheme adjusts the priorities of different classes of coflows based on their sent bytes, with a different threshold for each class. The optimal thresholds are analyzed using the M/M/1 queuing model. Testbed evaluations and simulations with production workloads show that MCS outperforms the previous information-agnostic scheduler Aalo and reduces the completion time of small coflows.
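The two stages described in the abstract (a width-based initial priority, then demotion driven by sent bytes with a per-class threshold) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the width cutoff, the two threshold values, the number of priority levels, the choice of which class gets the larger threshold, and all names (`Coflow`, `on_start`, `on_bytes_sent`) are assumptions made only for illustration.

```python
from dataclasses import dataclass

# All constants are illustrative assumptions, not values from the paper.
WIDTH_CUTOFF = 10                 # coflows with at most this many flows count as "narrow"
NARROW_THRESHOLD = 10 * 2**20     # sent-bytes threshold for the narrow class (assumed)
WIDE_THRESHOLD = 100 * 2**20      # sent-bytes threshold for the wide class (assumed)
LOWEST_PRIORITY = 2               # priorities run from 0 (highest) to 2 (lowest)


@dataclass
class Coflow:
    coflow_id: str
    width: int            # number of parallel flows in the coflow
    sent_bytes: int = 0   # bytes sent so far, the only size information available
    priority: int = 0     # 0 is the highest priority


def is_narrow(coflow: Coflow) -> bool:
    return coflow.width <= WIDTH_CUTOFF


def initial_priority(coflow: Coflow) -> int:
    # Width is known when the coflow starts, so narrow coflows start ahead of wide ones.
    return 0 if is_narrow(coflow) else 1


def on_start(coflow: Coflow) -> None:
    coflow.priority = initial_priority(coflow)


def on_bytes_sent(coflow: Coflow, nbytes: int) -> None:
    # Double-threshold idea: each class is compared against its own threshold.
    coflow.sent_bytes += nbytes
    threshold = NARROW_THRESHOLD if is_narrow(coflow) else WIDE_THRESHOLD
    demotion = 1 if coflow.sent_bytes >= threshold else 0
    coflow.priority = min(initial_priority(coflow) + demotion, LOWEST_PRIORITY)
```

In this sketch a narrow coflow stays in the highest-priority queue until it has sent `NARROW_THRESHOLD` bytes, while a wide coflow starts one level lower and is demoted against a separate threshold. How the thresholds are actually chosen is the subject of the paper's M/M/1 analysis and is not reproduced here.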
