Abstract

Workflow scheduling is crucial to the efficient operation of cloud platforms and has attracted considerable attention. Many algorithms have been reported that schedule workflows under budget constraints so as to optimize workflow makespan on cloud resources. Nevertheless, the hourly-based billing model in cloud computing remains a challenge for workflow scheduling, and it easily results in a higher makespan or even infeasible solutions. Moreover, data dependencies among workflow tasks inevitably leave many idle slots on cloud resources. Few works adequately exploit these idle slots to duplicate tasks' predecessors and shorten their completion time, thereby minimizing a workflow's makespan while satisfying its budget constraint. Motivated by these observations, we propose a task duplication based scheduling algorithm, namely TDSA, to optimize makespan for budget-constrained workflows on cloud platforms. TDSA introduces two novel mechanisms: 1) a dynamic sub-budget allocation mechanism, which recovers the unused budget of scheduled workflow tasks and redistributes the remaining budget, making it possible to use more expensive and more powerful cloud resources to shorten the completion time of unscheduled tasks; and 2) a duplication-based task scheduling mechanism, which exploits idle slots on resources to selectively duplicate tasks' predecessors, thus advancing these tasks' completion time while trying to ensure their sub-budget constraints. Finally, we carry out four groups of experiments, three on randomly generated workflows and one on actual workflows, to compare the proposed TDSA with four baseline algorithms. Experimental results show that TDSA clearly outperforms the baselines, reducing workflow makespan by up to 17.4% and improving the utilization of cloud computing resources by up to 31.6%.
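The two budget-related ideas in the abstract, hourly billing and redistribution of unused budget, can be illustrated with a minimal Python sketch. This is not the TDSA allocation rule itself, which is not given in this summary. The proportional weighting, the function names rental_cost and redistribute, and all numbers are assumptions chosen only for illustration.

```python
import math

def rental_cost(busy_seconds: float, price_per_hour: float) -> float:
    """Cost under hourly billing: every started hour is charged in full."""
    return math.ceil(busy_seconds / 3600) * price_per_hour

def redistribute(remaining_budget: float, weights: dict) -> dict:
    """Split the remaining budget among unscheduled tasks.

    Assumption: shares are proportional to a per-task weight
    (for example, an estimated workload). The paper may use another rule.
    """
    total = sum(weights.values())
    return {task: remaining_budget * w / total for task, w in weights.items()}

# Hypothetical workflow with three tasks and a total budget of 10.0.
budget = 10.0
weights = {"t1": 1.0, "t2": 1.0, "t3": 2.0}
initial = redistribute(budget, weights)   # t1 is allotted 2.5

# Suppose t1 is scheduled and actually costs less than its sub-budget.
spent_t1 = rental_cost(busy_seconds=3601, price_per_hour=0.5)  # two started hours
del weights["t1"]
updated = redistribute(budget - spent_t1, weights)  # unused 1.5 is recovered
```

In this toy run, t1 spends 1.0 of its 2.5 sub-budget, so the sub-budgets of t2 and t3 grow from 2.5 and 5.0 to 3.0 and 6.0. The rental_cost call also shows why hourly billing is awkward: one second past the hour boundary doubles the charge.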

Highlights

  • As a new computing paradigm, cloud computing provides end-users with highly scalable applications, platforms, and hardware as services through the Internet [1]

  • Cloud computing has developed rapidly in recent years and is widely used to process big data applications from various fields, such as astronomy [3], [4], healthcare [5], bioinformatics [6], intelligent transportation [7], and the Internet of Things [8], [9]

  • Each stage contains a large number of tasks, and the input data of these tasks is the output data of the tasks in the previous stage [13]. These applications are termed workflows in the distributed community [14], and a workflow can be formulated as a Directed Acyclic Graph (DAG), where nodes stand for its tasks and edges indicate the data dependencies among tasks [15]
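The DAG formulation in the last highlight can be made concrete with a small sketch. The four-task workflow below is hypothetical. It uses Python's standard graphlib module (Python 3.9+) to derive one execution order in which every task runs after all of its predecessors.

```python
from graphlib import TopologicalSorter

# Hypothetical workflow: key = task, value = set of tasks whose
# output data it needs (its predecessors in the DAG).
predecessors = {
    "A": set(),
    "B": {"A"},
    "C": {"A"},
    "D": {"B", "C"},
}

# static_order() yields the tasks so that predecessors always come first.
order = list(TopologicalSorter(predecessors).static_order())

# Check the dependency property explicitly.
position = {task: i for i, task in enumerate(order)}
respects_dependencies = all(
    position[p] < position[task]
    for task, preds in predecessors.items()
    for p in preds
)
```

The order of B and C is not fixed by the graph, since neither depends on the other. That freedom is what a scheduler exploits when it places independent tasks on different resources.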


Summary

INTRODUCTION

As a new computing paradigm, cloud computing provides end-users with highly scalable applications, platforms, and hardware as services through the Internet [1]. Zhou et al. defined a balance factor for workflow tasks based on their optimistic spare deadlines and budgets, and designed a resource selection approach for workflow tasks to improve the likelihood of meeting workflows’ deadline and budget requirements [34]. These existing works ignore the idle time slots that data dependencies among workflow tasks leave on resources, and do not exploit task duplication to bring forward tasks’ start and finish times. To ensure reliability constraints or improve fault tolerance, task replication mechanisms were designed for scheduling workflows in cloud computing [39]–[41]. These two works did not consider the budget constraint of workflows, and their task replication mechanisms tend to replicate most tasks multiple times, regardless of whether idle slots are available, which increases resource usage significantly and makes it more likely that the cost exceeds the workflow’s budget.
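The benefit of duplicating a predecessor into an idle slot can be seen in a minimal Python sketch. It rests on simplifying assumptions that do not come from the paper: a single predecessor with no inputs of its own, a known transfer time between resources, and zero transfer time when both tasks run on the same resource. The function name choose_start and all numbers are hypothetical.

```python
def choose_start(idle_from, pred_finish, transfer_time, pred_runtime):
    """Decide whether to duplicate a predecessor on the task's own resource.

    idle_from:     time at which the task's resource becomes idle
    pred_finish:   finish time of the predecessor on another resource
    transfer_time: time to ship the predecessor's output across resources
    pred_runtime:  time to re-run the predecessor on the task's resource
    """
    # Without duplication, the task waits for the remote data to arrive.
    plain = max(idle_from, pred_finish + transfer_time)
    # With duplication, the copy runs in the idle slot and its output is local.
    duplicated = idle_from + pred_runtime
    if duplicated < plain:
        return ("duplicate", duplicated)
    return ("wait", plain)

# Large transfer: re-running the predecessor locally beats waiting for data.
slow_link = choose_start(idle_from=0, pred_finish=4, transfer_time=6, pred_runtime=5)

# Small transfer: waiting for the original output is faster.
fast_link = choose_start(idle_from=0, pred_finish=4, transfer_time=0.5, pred_runtime=5)
```

With the slow link, the task starts at time 5 instead of 10 because the copy uses time during which the resource would otherwise sit idle. With the fast link, duplication brings no gain, which is why the duplication has to be selective rather than applied to every task.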

PROBLEM DESCRIPTION
OPTIMIZATION MODEL
EXPERIMENTAL STUDIES
EXPERIMENT DESIGN
Findings
CONCLUSIONS AND FUTURE WORK
