Bandwidth-Guaranteed Resource Allocation and Scheduling for Parallel Jobs in Cloud Data Center

Zhen Li, Xiaocheng Liu, Xiaogang Qiu, Bin Chen, Yiping Wang, Qihang Wei, Dandan Ning

doi:10.3390/sym10050134

Abstract

Cloud computing has emerged as a powerful and promising way to run high-performance computing (HPC) jobs. Most HPC jobs follow a multi-process paradigm and involve frequent communication and synchronization among parallel processes. However, because the underlying resources of cloud data centers are shared among multiple tenants, competition among jobs for limited bandwidth leads to unpredictable job completion times, which may cause QoS violations and inefficient resource utilization when scheduling parallel jobs in the cloud. To tackle this issue, it is essential to provide bandwidth guarantees for parallel jobs running in the cloud. Offering a dedicated virtual cluster (VC) for running applications in the cloud is a popular way to guarantee bandwidth demands. Motivated by these problems, we first design a time-aware virtual cluster (TVC) request model for parallel jobs and consider how to embed the requested TVCs into the cloud efficiently under a parallel job scheduling framework. We propose an adaptive bandwidth-aware heuristic algorithm, denoted AdaBa, that improves the job accept rate by adaptively adjusting the priorities of the servers that accommodate a TVC's VMs according to the relative size of the requested bandwidth. We then design a bandwidth-guaranteed migration and backfilling scheduling algorithm, denoted BgMBF, to schedule parallel jobs, with bandwidth demands guaranteed by AdaBa. To obtain high job responsiveness, a bandwidth-reserved job backfilling strategy is applied when the TVC requested by the currently scheduled job cannot be allocated in the cloud. The migration cost of BgMBF is also considered, and an enhanced version, BgMBFSDF, is proposed to minimize the number of migrations when the execution times of jobs are known.
In extensive simulation experiments on popular parallel workloads, the proposed TVC embedding algorithm AdaBa achieves up to a 15 percent improvement in accept rate compared with existing algorithms such as Oktopus and a greedy algorithm. The proposed BgMBF and BgMBFSDF also significantly outperform other popular scheduling algorithms integrated with AdaBa on average response time and average bounded slowdown.

Highlights

Based on virtualization, data management techniques, etc., the cloud computing paradigm delivers cost-effective and powerful Infrastructure as a Service (IaaS), as well as flexible and customized Platform as a Service (PaaS) and Software as a Service (SaaS), which allow agile customization to the specific application, software, and programming-environment needs of users
When a time-aware virtual cluster (TVC) is accepted by the cloud, Ni VMs are deployed onto the idle slots of the servers in the cloud data center, and the residual bandwidth capacities of the links along the paths routing to the corresponding servers are reduced
The management and scheduling of parallel jobs in the cloud can be treated as a variant of the job scheduling problem that integrates TVC embedding
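
The deployment step described in the second highlight can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the paper's AdaBa algorithm: it assumes each server has a single uplink, and it assumes the hose-model rule commonly used for virtual clusters, in which a link that separates m of a job's N VMs from the rest must reserve min(m, N - m) times the per-VM bandwidth. The names `Server`, `hose_bandwidth`, and `place_vms` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Server:
    """One physical server with VM slots and a single uplink (assumed topology)."""
    free_slots: int
    residual_uplink_bw: float  # bandwidth still unreserved on the uplink


def hose_bandwidth(vms_here: int, total_vms: int, per_vm_bw: float) -> float:
    """Bandwidth a link must reserve when it separates `vms_here` VMs from the rest.

    Assumption: hose model, so traffic across the link is bounded by the
    smaller of the two VM groups on either side of it.
    """
    return min(vms_here, total_vms - vms_here) * per_vm_bw


def place_vms(server: Server, vms_here: int, total_vms: int, per_vm_bw: float) -> bool:
    """Try to place `vms_here` of a cluster's VMs on `server`.

    On success, idle slots and residual uplink bandwidth are reduced.
    On failure, the server state is left unchanged.
    """
    needed = hose_bandwidth(vms_here, total_vms, per_vm_bw)
    if vms_here > server.free_slots or needed > server.residual_uplink_bw:
        return False  # not enough idle slots or residual bandwidth
    server.free_slots -= vms_here
    server.residual_uplink_bw -= needed
    return True
```

Under these assumptions, placing all N VMs of a cluster on one server reserves no uplink bandwidth at all, which is why the choice of servers affects how much bandwidth remains for later requests.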

Summary

Introduction

Based on virtualization, data management techniques, etc., the cloud computing paradigm delivers cost-effective and powerful Infrastructure as a Service (IaaS), as well as flexible and customized Platform as a Service (PaaS) and Software as a Service (SaaS). Prior work proposed a time-aware virtual cluster request model that can specify an estimated required time duration for jobs, and designed several online heuristic algorithms to allocate resources for scheduled requests. To improve the accept rate of jobs, the scheduling algorithm tries to pick the most suitable requested jobs to execute in the cloud according to their bandwidth and time-duration profiles. Inspired by Dalvandi et al. [14], we design a time-aware virtual cluster (TVC) request model to specify the resource demands of parallel jobs. The contributions of this paper are as follows: we propose an efficient adaptive bandwidth-aware virtual cluster embedding algorithm that allocates the requested virtual cluster resources for scheduled parallel jobs running in the cloud, and that frees up more bandwidth on the links through an adaptive communication-hiding strategy to improve the accept rate for subsequently arriving jobs.
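
To make the idea of a time-aware request concrete, the following Python sketch models a request that carries an estimated duration and checks how much bandwidth is already reserved on a link during a candidate time window. It is an illustration under assumptions, not the request model defined in the paper: the field names of `TVCRequest`, the `(start, end, bandwidth)` reservation tuples, and the helper `peak_reserved` are all hypothetical, and time intervals are assumed to be half-open.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TVCRequest:
    """Hypothetical time-aware virtual cluster request."""
    num_vms: int       # number of VMs requested
    per_vm_bw: float   # bandwidth requested per VM
    start: float       # requested start time
    duration: float    # estimated required time duration

    @property
    def end(self) -> float:
        return self.start + self.duration


def peak_reserved(reservations, start, end):
    """Peak bandwidth reserved on one link within the window [start, end).

    `reservations` is a list of (start, end, bandwidth) tuples. Only
    reservations that overlap the window contribute to the peak.
    """
    events = []
    for s, e, bw in reservations:
        if s < end and start < e:  # half-open interval overlap test
            events.append((max(s, start), bw))
            events.append((min(e, end), -bw))
    # Sorting tuples puts releases (negative) before acquisitions at equal times.
    events.sort()
    peak = current = 0.0
    for _, delta in events:
        current += delta
        peak = max(peak, current)
    return peak
```

A scheduler could compare `peak_reserved` over a request's `[start, end)` window against the link capacity to decide whether the request fits, so bandwidth held by jobs that finish before the window begins does not count against it.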

Bandwidth Allocation in the Cloud

Parallel Job Scheduling

Data Center Network Model

Virtual Cluster Model

Problem Definition

Constraints

Proposed Algorithms

Simulation Settings

Workload

Performance of TVC Embedding

Scheduling Performance

Conclusions


Journal: Symmetry	Publication Date: Apr 25, 2018
Citations: 1	License type: CC BY 4.0


