Joint Scheduling of Tasks and Network Flows in Big Data Clusters

Lei Yang, Xuxun Liu, Zhenyu Wang, Jiannong Cao

doi:10.1109/access.2018.2878864

Open Access

https://doi.org/10.1109/access.2018.2878864

Abstract

As a growing number of big data processing platforms such as Hadoop, Spark, and Storm appear and typically share the resources of a data center, it has become both important and challenging to schedule the various jobs from these platforms onto the underlying data center resources so that the overall job completion time is minimized. Existing work addresses this problem by focusing either on task-level scheduling techniques, such as Quincy and delay scheduling, or on network flow scheduling techniques, such as D3 and preemptive distributed quick (PDQ). These approaches schedule tasks and network flows separately and therefore cannot achieve optimal performance. The reason is that task scheduling that disregards the available network bandwidth may produce a task placement that causes serious network congestion and thus leads to long data transmission times. In this paper, we propose a joint scheduling technique that coordinates the task placement with the scheduling of the network flows arising from these tasks. We develop a software-defined network (SDN)-based online scheduling framework that selects the task placement based on the available bandwidth on the SDN switches and, at the same time, optimally allocates bandwidth to each data flow. Comprehensive trace-driven simulations show that the joint scheduling technique makes full use of the network bandwidth and thus reduces the job completion time by 55% on average compared with the benchmark methods.
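
The coordination described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's SDN-based framework, and it does not reproduce the paper's optimal bandwidth allocation. It is a toy greedy heuristic under stated assumptions: each node has a single bottleneck link, the names Task, transfer_time, and place_jointly are hypothetical, and a placed flow simply reserves all of the residual bandwidth on its destination node.

```python
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    input_mb: float   # size of the input the task must read
    data_node: str    # node that already stores that input


def transfer_time(task, node, free_bw):
    """Estimated seconds to ship the task's input to `node`.

    Toy model (an assumption): one bottleneck link per node, with
    free_bw[node] giving its unreserved bandwidth in MB/s.
    """
    if node == task.data_node:
        return 0.0  # data-local placement needs no network flow
    bw = free_bw[node]
    return float("inf") if bw <= 0 else task.input_mb / bw


def place_jointly(tasks, slots, free_bw):
    """Greedily place each task, then reserve bandwidth for its flow."""
    slots, free_bw = dict(slots), dict(free_bw)  # leave caller's data intact
    plan = {}
    for task in tasks:
        # Only nodes with a free task slot are candidates.
        candidates = sorted(n for n, free in slots.items() if free > 0)
        if not candidates:
            raise ValueError(f"no free slot for {task.name}")
        # Placement decision uses the current bandwidth view.
        node = min(candidates, key=lambda n: transfer_time(task, n, free_bw))
        plan[task.name] = (node, transfer_time(task, node, free_bw))
        slots[node] -= 1
        if node != task.data_node:
            free_bw[node] = 0.0  # the flow takes all residual bandwidth
    return plan
```

For example, with one slot on each of nodes A, B, and C, free bandwidth of 100, 10, and 50 MB/s respectively, and two 500 MB tasks whose input lives on A, the first task runs locally on A and the second is sent to C (about 10 s of transfer) rather than B (about 50 s). A placement that ignored bandwidth could just as easily have chosen B, which is the kind of congestion-induced delay the abstract attributes to separate scheduling.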

Journal: IEEE Access
Publication Date: Jan 1, 2018
Citations: 30
License type: CC BY-NC-ND

Similar Papers

Data-Aware Task Allocation for Achieving Low Latency in Collaborative Edge Computing
Yuvraj Sahni ... Lei Yang
IEEE Internet of Things Journal | VOL. 6
01 Apr 2019

PushBox: Making Use of Every Bit of Time to Accelerate Completion of Data-Parallel Jobs
Chen Tian ... Yi Wang
IEEE Transactions on Parallel and Distributed Systems | VOL. 33
01 Dec 2022

Towards Intelligent Flow Scheduling in Software Defined Data Center Networking
Tianshu Wang ... Yinjie Lin
02 Dec 2022

Joint Task and Flow Scheduling for Time-Triggered and Strict-Priority Networks
Anna Arestova ... Reinhard German
20 Feb 2023