Dynamic Fault-Tolerant Workflow Scheduling with Hybrid Spatial-Temporal Re-Execution in Clouds

Na Wu, Decheng Zuo, Zhan Zhang

doi:10.3390/info10050169

Abstract

Improving reliability is one of the major concerns of scientific workflow scheduling in clouds. The ever-growing computational complexity and data size of workflows present challenges to fault-tolerant workflow scheduling. Therefore, it is essential to design a cost-effective fault-tolerant scheduling approach for large-scale workflows. In this paper, we propose a dynamic fault-tolerant workflow scheduling (DFTWS) approach with hybrid spatial and temporal re-execution schemes. First, DFTWS calculates the time attributes of tasks and identifies the critical path of the workflow in advance. Then, in the initial resource allocation phase, DFTWS assigns an appropriate virtual machine (VM) to each task according to the task urgency and budget quota. Finally, DFTWS performs online scheduling, making real-time fault-tolerant decisions based on failure type and task criticality throughout workflow execution. The proposed algorithm is evaluated on real-world workflows, and the factors that affect the performance of DFTWS are analyzed. The experimental results demonstrate that DFTWS achieves a trade-off between the objectives of high reliability and low cost in cloud computing environments.
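As an illustration of the critical-path step mentioned in the abstract, the following Python sketch finds a longest path through a workflow DAG, counting task execution times and inter-task communication times. The abstract does not give the paper's exact definition, so treating the critical path as the longest path is an assumption, and the names `critical_path`, `exec_time`, `comm_time`, and `preds` are hypothetical.

```python
from graphlib import TopologicalSorter  # Python 3.9+


def critical_path(exec_time, comm_time, preds):
    """Return (path, length) of a longest path through a workflow DAG.

    exec_time: {task: execution time}
    comm_time: {(u, v): communication time from u to v}
    preds:     {task: list of predecessor tasks}
    Assumption: the critical path is the longest path by
    execution plus communication time.
    """
    order = TopologicalSorter(preds).static_order()  # predecessors first
    finish = {}      # earliest finish time of each task
    best_pred = {}   # predecessor that determines each task's start
    for t in order:
        start, choice = 0.0, None
        for p in preds.get(t, ()):
            cand = finish[p] + comm_time.get((p, t), 0.0)
            if cand > start:
                start, choice = cand, p
        finish[t] = start + exec_time[t]
        best_pred[t] = choice
    # Walk back from the task that finishes last.
    end = max(finish, key=finish.get)
    path = [end]
    while best_pred[path[-1]] is not None:
        path.append(best_pred[path[-1]])
    path.reverse()
    return path, finish[end]
```

For a four-task diamond in which the branch through `c` carries the largest communication delay, the function returns that branch as the path together with its total length.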

Highlights

In recent years, scientific workflow has been applied widely as a new paradigm of data analysis and scientific computation [1].
Notation: execution time of $t_i$; communication time between $t_i$ and $t_j$; critical path; budget quota of $t_i$; reliability of $t_i$ with the spatial re-execution (SRE) scheme; reliability of $t_i$ with the temporal re-execution (TRE) scheme; type of virtual machine (VM) selected by $t_i$; price of VMT.
Case 1: $IST - rt \le FOT \le \max_{t_s \in succ(t_i)} (AEET + CT_{is})$, where IST is the instance start time and FOT is the failure occurrence time. The failure occurs during the execution or data transmission of $t_i$; that is, the transient failure recovers after the instance of $t_i$ starts and before all data transfers from $t_i$ to $t_s$ finish.
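The Case 1 condition above can be read as a simple interval test. The Python sketch below checks whether a failure occurrence time falls inside that interval; the `TaskTiming` container and the `is_case1` function are hypothetical names, and using AEET alone as the upper bound for a task with no successors is an assumption the text does not cover.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class TaskTiming:
    ist: float                    # instance start time (IST) of t_i
    aeet: float                   # AEET of t_i, as used in the Case 1 bound
    ct_to_succ: Dict[str, float]  # CT_is for each successor t_s of t_i


def is_case1(fot: float, rt: float, task: TaskTiming) -> bool:
    """Check IST - rt <= FOT <= max over successors of (AEET + CT_is)."""
    lower = task.ist - rt
    # Assumption: with no successors, the upper bound falls back to AEET.
    upper = max(
        (task.aeet + ct for ct in task.ct_to_succ.values()),
        default=task.aeet,
    )
    return lower <= fot <= upper
```

With `ist=10`, `aeet=25`, successor communication times of 3 and 5, and `rt=4`, the interval is [6, 30], so a failure at time 8 matches Case 1 while one at time 31 does not.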

Summary

Introduction

Scientific workflow has been applied widely as a new paradigm of data analysis and scientific computation [1]. Workflows can be deployed and executed in clouds that provide a virtually infinite resource pool in a pay-as-you-go manner [5]. In this way, workflows can acquire and release cloud resources on demand to achieve a cost-effective operating mode. These advantages make clouds a preferred execution environment for scientific workflows. Without an effective fault-tolerant scheduling scheme, however, failures will prevent deadline-aware workflows from completing on time. In this situation, the QoS is severely affected, because the results might be obtained only after the deadline. We propose a dynamic fault-tolerant workflow scheduling approach with hybrid spatial-temporal re-execution, called DFTWS.

Related Work

Preliminaries

Cloud System

Workflow Model

Fault Tolerance Schemes

Failure Model

Cost Model

DFTWS Algorithm

Static Node Information Calculation

Critical Path Identification

Initial Resource Allocation

Online Scheduling

Experimental Setup

Impact of DM

Impact of FR

Impact of Workflow Structure

Experimental Summary

Findings

Conclusions


Journal: Information	Publication Date: May 5, 2019
Citations: 11	License type: CC BY 4.0


