Abstract

In distributed computing frameworks such as MapReduce, Spark, and Dryad, a coflow is a set of flows that transfer data between two stages of a job. The job cannot start its next stage until all flows in the coflow finish. To improve the execution performance of such a job, it is crucial to reduce the completion time of a coflow, which can account for more than 50 percent of the job completion time. While several coflow schedulers have been proposed, we observe that routing, a factor that greatly affects the Coflow Completion Time (CCT), has not been well considered. In this article, we focus on the coflow scheduling problem and jointly consider routing and bandwidth allocation. We begin by providing an analytical solution to the problem of optimal bandwidth allocation with pre-determined routes. We then formulate the problem of scheduling a single coflow as a Non-linear Mixed Integer Programming problem and present its relaxed convex optimization problem. We further propose two algorithms, CoRBA and its simplified version CoRBA-fast, that solve the single coflow scheduling problem with a joint consideration of routing and bandwidth allocation. Lastly, to address multiple coflows in online scheduling, we propose an online scheduler named OnCoRBA. By comparing with state-of-the-art algorithms and schedulers via simulations, we demonstrate that CoRBA and CoRBA-fast reduce the CCT by 30-400 percent and that the OnCoRBA scheduler reduces the average online CCT by 20-230 percent. In addition, CoRBA-fast can be hundreds of times faster than CoRBA at a cost of around 8 percent performance degradation, which makes CoRBA-fast well suited to practical use.
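To make the fixed-route case concrete, the following is a minimal sketch of one standard way to allocate bandwidth when every flow's route is already chosen. It is not taken from the article and may differ from its analytical solution. The function name, the input layout, and the link identifiers are assumptions made for illustration. The idea is that the CCT cannot be smaller than the most loaded link's total volume divided by its capacity, and that giving each flow a rate of volume / CCT lets all flows finish together at exactly that bound without overloading any link.

```python
from collections import defaultdict


def min_cct_fixed_routes(flows, capacity):
    """Hypothetical helper: bandwidth allocation for one coflow with fixed routes.

    flows:    list of (volume, route), where route is a list of link ids
    capacity: dict mapping link id -> link capacity (volume units per second)
    Returns (cct, rates), where rates[i] is the rate given to flows[i].
    """
    # Total volume that must cross each link.
    load = defaultdict(float)
    for volume, route in flows:
        for link in route:
            load[link] += volume

    # The bottleneck link sets a lower bound on the completion time.
    cct = max((load[link] / capacity[link] for link in load), default=0.0)
    if cct == 0.0:
        return 0.0, [0.0 for _ in flows]

    # Every flow gets just enough rate to finish exactly at the bound.
    rates = [volume / cct for volume, _ in flows]
    return cct, rates


# Illustrative input (made-up numbers): three flows over links "a", "b", "c".
example_flows = [(100.0, ["a", "b"]), (50.0, ["b", "c"]), (30.0, ["c"])]
example_capacity = {"a": 10.0, "b": 10.0, "c": 10.0}
example_cct, example_rates = min_cct_fixed_routes(example_flows, example_capacity)
```

In this made-up input, link "b" carries the most volume, so it determines the completion time; choosing different routes would change which link is the bottleneck, which is why routing and bandwidth allocation interact.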
