Abstract

Scientific workflows benefit from the cloud computing paradigm, which offers access to virtual resources provisioned on a pay-as-you-go, on-demand basis. Minimizing resource costs to meet the user's budget is very important in a cloud environment. Several optimization approaches have been proposed to improve the performance and cost of data-intensive scientific Workflow Scheduling (DiSWS) in cloud computing. However, the majority of DiSWS approaches in the literature focus on heuristics and metaheuristics as optimization methods. Furthermore, the task hierarchy in data-intensive scientific workflows has not been extensively explored in the current literature. In this paper, a data-intensive scientific workflow is represented as a hierarchy that specifies the hierarchical relations between workflow tasks, and an approach for scheduling data-intensive workflow applications is proposed. In this approach, the datasets and workflow tasks are first modeled as a conditional probability matrix (CPM). Second, several data transformations and hierarchical clustering are applied to the CPM structure to determine the minimum number of virtual machines needed for the workflow execution, where the hierarchical clustering is performed with respect to the budget imposed by the user. After the data transformation and hierarchical clustering, the amount of data transmitted between clusters is reduced, which improves the cost and makespan of the workflow by optimizing the use of virtual resources and network bandwidth. The performance and cost are analyzed using an extension of the CloudSim simulation tool and compared with existing multi-objective approaches. The results demonstrate that our approach reduces resource costs while respecting user budgets.
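
The pipeline described above can be illustrated with a minimal sketch. The code below assumes a hypothetical task/dataset usage matrix; the CPM construction, the distance measure, and the budget-driven choice of the cluster count are simplified placeholders for illustration, not the paper's exact formulation.

```python
# Minimal sketch of the clustering pipeline described in the abstract.
# The CPM construction, distance measure, and budget-driven cluster count
# are simplified placeholders, not the paper's exact formulation.
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import squareform

# Hypothetical usage matrix: rows = tasks, columns = datasets,
# entry (i, j) = 1 if task i reads or writes dataset j.
usage = np.array([
    [1, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 1, 1],
], dtype=float)

# Conditional-probability-style matrix (CPM): entry (i, j) estimates the
# probability that task j shares a dataset used by task i.
co_usage = usage @ usage.T                       # shared-dataset counts
cpm = co_usage / usage.sum(axis=1, keepdims=True)

# Turn the (asymmetric) similarity into a symmetric distance matrix.
similarity = (cpm + cpm.T) / 2.0
distance = 1.0 - similarity
np.fill_diagonal(distance, 0.0)

# Agglomerative (hierarchical) clustering over the task distances.
linkage_matrix = linkage(squareform(distance, checks=False), method="average")

# Budget-driven choice of the number of clusters (i.e. VMs): a real
# implementation would derive this bound from VM prices and the user budget.
max_vms_within_budget = 2
labels = fcluster(linkage_matrix, t=max_vms_within_budget, criterion="maxclust")
print("task -> VM cluster:", labels)
```

Tasks assigned to the same cluster would be co-located on one virtual machine, so only inter-cluster dependencies generate network transfers; this is the mechanism by which the clustering reduces both transfer cost and makespan.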

Highlights

  • In recent years, cloud environments have been increasingly used in the scientific field [1]

  • We focus on the following research question: What is the number of virtual machines required for the efficient and transparent execution of a workflow in a cloud environment?

  • Our work aims to reduce the monetary cost of data movements during workflow execution and to improve network utilization in the cloud environment

Introduction

In recent years, cloud environments have been increasingly used in the scientific field [1]. The majority of workflow scheduling approaches employ heuristics and meta-heuristics as optimization methods and focus only on execution time [7]; even then, communication among tasks is often assumed to take zero time. Traditional techniques have examined data sharing among workflow tasks, and these techniques for scheduling scientific workflow tasks inspired the development of our approach. We propose a novel approach for workflow scheduling that considers the hierarchy of scientific workflow tasks. We focus on the following research question: What is the number of virtual machines required for the efficient and transparent execution of a workflow in a cloud environment?
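
As a point of contrast with the zero-communication assumption mentioned above, the following sketch estimates the delay and monetary cost of moving an intermediate dataset between two tasks placed on different virtual machines. The bandwidth and per-GB price values are illustrative assumptions, not figures from the paper.

```python
# Illustrative estimate of inter-VM data movement overhead; the bandwidth
# and price figures are assumptions for the example, not values from the paper.
def transfer_time_seconds(data_gb: float, bandwidth_gbps: float) -> float:
    """Time to move `data_gb` gigabytes over a link of `bandwidth_gbps` Gbit/s."""
    return (data_gb * 8.0) / bandwidth_gbps

def transfer_cost_dollars(data_gb: float, price_per_gb: float) -> float:
    """Monetary cost of the same movement under a per-GB transfer price."""
    return data_gb * price_per_gb

if __name__ == "__main__":
    data_gb = 50.0                      # hypothetical intermediate dataset
    time_s = transfer_time_seconds(data_gb, bandwidth_gbps=1.0)
    cost = transfer_cost_dollars(data_gb, price_per_gb=0.09)
    print(f"transfer: {time_s:.0f} s, ${cost:.2f}")
    # If the producing and consuming tasks are co-located on one VM,
    # both the delay and the cost above drop to approximately zero.
```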

Related Works
Problem Statement
Application Model
Execution Model
Data Transfer Model
Proposed Approach
Task Clustering Based on Conditional Probability
Transforming Data
Tasks Distance Measures
Cluster Analysis Method and VMs Number Interval
Hierarchical Clustering Method
Objective Function
Evaluation Methods
Experiment 1
Experiment 2
Experiment 3
Conclusion