Coflow scheduling is crucial for improving application-level communication performance in data-parallel clusters. While schemes such as Varys can potentially achieve optimal performance, their dependence on a priori information about coflows poses practical challenges. Existing non-clairvoyant solutions, such as Aalo, approximate classical online Shortest-Job-First (SJF) scheduling but fail to identify the bottleneck flows within a coflow. Consequently, they often allocate excessive bandwidth to non-bottleneck flows, which wastes bandwidth and degrades overall performance. In this paper, we introduce MSDQ, a coflow scheduling mechanism that operates without prior knowledge, using multi-scheduling dual-priority queues and width estimates. MSDQ adjusts coflow queue priorities and scheduling order based on each coflow’s width and the volume of data it has transmitted. By reallocating unused network bandwidth at multiple points during the scheduling process, MSDQ maximizes bandwidth utilization and significantly reduces the average coflow completion time. Our evaluation, using a publicly available production cluster trace from Facebook, shows that MSDQ reduces the average coflow completion time by 1.42× compared to Aalo.