Abstract

The Map-Reduce computing framework rose to prominence with datasets of such size that dozens of machines on a single cluster were needed for individual jobs. As datasets approach the exabyte scale, a single job may need distributed processing not only on multiple machines, but on multiple clusters. We consider a scheduling problem to minimize weighted average completion time of n jobs on m distributed clusters of parallel machines. In keeping with the scale of the problems motivating this work, we assume that (1) each job is divided into m “subjobs” and (2) distinct subjobs of a given job may be processed concurrently. When each cluster is a single machine, this is the NP-Hard concurrent open shop problem. A clear limitation of such a model is that a serial processing assumption sidesteps the issue of how different tasks of a given subjob might be processed in parallel. Our algorithms explicitly model clusters as pools of resources and effectively overcome this issue. Under a variety of parameter settings, we develop two constant factor approximation algorithms for this problem. The first algorithm uses an LP relaxation tailored to this problem from prior work. This LP-based algorithm provides strong performance guarantees. Our second algorithm exploits a surprisingly simple mapping to the special case of one machine per cluster. This mapping-based algorithm is combinatorial and extremely fast. These are the first constant factor approximations for this problem.
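To make the objective concrete: in this model a job completes only when all of its subjobs have completed, and the goal is to minimize the weighted average of job completion times. The Python sketch below evaluates that objective for a given schedule; the function name and dict-based data layout are our own illustration, not notation from the paper.

    # Evaluate weighted average completion time for a schedule.
    # A job finishes when its last subjob (one per cluster) finishes.
    # The dict-based layout is illustrative, not the paper's notation.
    def weighted_avg_completion(subjob_finish, weights):
        """subjob_finish[j][c]: finish time of job j's subjob on cluster c;
        weights[j]: weight of job j."""
        total = 0.0
        for j, per_cluster in subjob_finish.items():
            c_j = max(per_cluster.values())  # job j is done when its slowest subjob is done
            total += weights[j] * c_j
        return total / sum(weights.values())

    # Two jobs on two clusters: j1 finishes at time 7, j2 at time 3.
    finish = {"j1": {"c1": 4.0, "c2": 7.0},
              "j2": {"c1": 3.0, "c2": 2.0}}
    w = {"j1": 1.0, "j2": 2.0}
    print(weighted_avg_completion(finish, w))  # (1*7 + 2*3) / 3 = 4.33...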

Highlights

  • It is becoming increasingly impractical to store full copies of large datasets on more than one data center [7]

  • Hung et al. modeled each cluster as having an arbitrary number of identical parallel machines and chose average job completion time as the objective

  • Hung et al. proposed a particular algorithm for the controller called “SWAG,” which performed well in a wide variety of simulations in which each data center was assumed to have the same number of identical parallel machines

Introduction

It is becoming increasingly impractical to store full copies of large datasets on more than one data center [7]. Commercial platforms such as AWS Lambda and Microsoft’s Azure Service Fabric are demonstrating a trend toward centralized cloud computing frameworks in which the user manages neither data flow nor server allocation [1, 11]. In view of these converging issues, the following scheduling problem arises: if computation is done locally to avoid excessive network traffic, how can individual clusters on the broader grid coordinate schedules for maximum throughput? Hung et al. modeled each cluster as having an arbitrary number of identical parallel machines and chose average job completion time as the objective. As this problem generalizes the NP-Hard concurrent open shop problem, they proposed a heuristic approach. We instead develop algorithms with provable guarantees, e.g., a 2-approximation when machines are of unit speed and subjobs are divided into equally sized (but not necessarily unit) tasks.
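The section list below includes “List Scheduling from Permutations,” a standard technique in this line of work: fix a permutation of the jobs, then greedily build a schedule from it. As a hedged illustration only (the tie-breaking rule, data model, and function name are our assumptions, not the paper’s specification), the Python sketch below assigns each task of a subjob to the currently least-loaded machine of one cluster, processing jobs in permutation order. Feeding the resulting per-cluster finish times into the objective sketch above would then score the full assignment.

    import heapq

    # Illustrative list scheduling on one cluster of identical machines:
    # jobs are taken in permutation order, and each task of a job's
    # subjob goes to the currently least-loaded machine. The greedy
    # tie-breaking here is an assumption, not the paper's rule.
    def list_schedule(permutation, tasks, num_machines):
        """tasks[j]: list of task processing times for job j's subjob.
        Returns the finish time of each job's subjob on this cluster."""
        loads = [(0.0, m) for m in range(num_machines)]  # (load, machine id)
        heapq.heapify(loads)
        finish = {}
        for j in permutation:
            subjob_finish = 0.0
            for p in tasks[j]:
                load, m = heapq.heappop(loads)  # least-loaded machine
                load += p
                subjob_finish = max(subjob_finish, load)
                heapq.heappush(loads, (load, m))
            finish[j] = subjob_finish
        return finish

    # Three jobs on two machines, with a permutation from some sequencing rule.
    print(list_schedule(["a", "b", "c"],
                        {"a": [2.0, 2.0], "b": [3.0], "c": [1.0, 1.0, 1.0]},
                        num_machines=2))  # {'a': 2.0, 'b': 5.0, 'c': 5.0}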

Formal Problem Statement
Example Problem Instances
Related Work
A permutation of the authors’ names
The Core Linear Program
Statement of LP1
Proof of LP1’s Validity
Theoretical Complexity of LP1
List Scheduling from Permutations
An LP-based Algorithm
CC-LP for Uniform Machines
CC-LP for Identical Machines
Combinatorial Algorithms
A Degenerate Case for SWAG
CC-TSPT with Unit Tasks and Identical Machines
CC-ATSPT: Augmenting the LP Relaxation
Closing Remarks