Efficient Scheduling of Scientific Workflows Using Hot Metadata in a Multisite Cloud

Ji Liu, Esther Pacitti, Marta Mattoso, Luis Pineda, Alexandru Costan, Patrick Valduriez, Gabriel Antoniu

doi:10.1109/tkde.2018.2867857

Abstract

Large-scale, data-intensive scientific applications are often expressed as scientific workflows (SWfs). In this paper, we consider the problem of efficient scheduling of a large SWf in a multisite cloud, i.e., a cloud with geo-distributed cloud data centers (sites). The reasons for using multiple cloud sites to run a SWf are that data is already distributed, the necessary resources exceed the limits at a single site, or the monetary cost is lower. In a multisite cloud, metadata management has a critical impact on the efficiency of SWf scheduling, as it provides a global view of data location and enables task tracking during execution. Thus, metadata should be readily available to the system at any given time. While it has been shown that efficient metadata handling plays a key role in performance, little research has targeted this issue in multisite clouds. In this paper, we propose to identify and exploit hot metadata (frequently accessed metadata) for efficient SWf scheduling in a multisite cloud, using a distributed approach. We implemented our approach within a scientific workflow management system; the implementation shows that the approach reduces the execution time of highly parallel jobs by up to 64 percent and that of whole SWfs by up to 55 percent.
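The abstract defines hot metadata only as "frequently accessed metadata" and does not say how it is identified. As a rough illustration of the idea, and not the authors' method, the following sketch splits metadata keys into hot and cold by counting accesses against a fixed threshold. The function name `split_hot_cold`, the key format, the access log, and the threshold are all assumptions made for this example.

```python
from collections import Counter


def split_hot_cold(access_log, threshold):
    """Split metadata keys into (hot, cold) sets by access count.

    Hypothetical rule: a key is "hot" if it appears in the access log
    at least `threshold` times. The abstract does not specify a rule.
    """
    counts = Counter(access_log)
    hot = {key for key, n in counts.items() if n >= threshold}
    cold = set(counts) - hot
    return hot, cold


# Hypothetical access log: each entry is the key of a metadata record that was read.
access_log = ["task:1", "file:a", "task:1", "task:2", "task:1", "file:a", "file:b"]

hot, cold = split_hot_cold(access_log, threshold=2)
print(sorted(hot))   # ['file:a', 'task:1']
print(sorted(cold))  # ['file:b', 'task:2']
```

Under this assumption, a scheduler could keep the hot keys readily available at every site and handle the cold keys lazily; whether the paper does this is not stated in the abstract.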

Journal: IEEE Transactions on Knowledge and Data Engineering	Publication Date: Nov 21, 2017
Citations: 56	License type: other-oa

Similar Papers

Scientific Workflow Scheduling with Provenance Support in Multisite Cloud
Ji Liu ... Esther Pacitti
01 Jan 2017

Scientific Workflow Scheduling with Provenance Data in a Multisite Cloud
Ji Liu ... Patrick Valduriez
01 Jan 2017

Scientific Workflow Partitioning in Multisite Cloud
Ji Liu ... Marta Mattoso
01 Jan 2014

A Survey of Data-Intensive Scientific Workflow Management
Ji Liu ... Marta Mattoso
Journal of Grid Computing | VOL. 13
08 Mar 2015
