A data placement strategy in scientific cloud workflows

Dong Yuan, Yun Yang, Xiao Liu, Jinjun Chen

doi:10.1016/j.future.2010.02.004

Abstract

In scientific cloud workflows, large amounts of application data need to be stored in distributed data centres. To effectively store these data, a data manager must intelligently select data centres in which these data will reside. This is, however, not the case for data which must have a fixed location. When one task needs several datasets located in different data centres, the movement of large volumes of data becomes a challenge. In this paper, we propose a matrix-based k-means clustering strategy for data placement in scientific cloud workflows. The strategy contains two algorithms that group the existing datasets in k data centres during the workflow build-time stage and dynamically cluster newly generated datasets to the most appropriate data centres, based on dependencies, during the runtime stage. Simulations show that our algorithm can effectively reduce data movement during the workflow’s execution.
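The runtime idea in the abstract, placing a newly generated dataset in the data centre it depends on most, can be illustrated with a small Python sketch. This is not the paper's algorithm: the abstract does not define the dependency measure, so the sketch assumes that the dependency between two datasets is the number of tasks that use both, and all names (`dependency_matrix`, `place_new_dataset`, the task and dataset labels) are hypothetical.

```python
def dependency_matrix(task_inputs, datasets):
    """Assumed measure: dep[a][b] = number of tasks that use both a and b."""
    dep = {a: {b: 0 for b in datasets} for a in datasets}
    for used in task_inputs.values():
        for a in used:
            for b in used:
                dep[a][b] += 1
    return dep


def place_new_dataset(new_ds, dep, placement, k):
    """Return the data centre (0..k-1) with the highest total dependency on new_ds."""
    score = [0] * k
    for ds, centre in placement.items():
        score[centre] += dep[new_ds][ds]
    # Ties go to the lowest-numbered centre.
    return max(range(k), key=lambda c: score[c])


# Hypothetical workflow: each task lists the datasets it reads.
task_inputs = {
    "t1": {"d1", "d2"},
    "t2": {"d2", "d3"},
    "t3": {"d3", "d4"},
    "t4": {"d4", "d5"},
    "t5": {"d4", "d5"},
}
datasets = ["d1", "d2", "d3", "d4", "d5"]
dep = dependency_matrix(task_inputs, datasets)

# Existing datasets already placed in k = 2 centres; d5 is newly generated.
placement = {"d1": 0, "d2": 0, "d3": 1, "d4": 1}
chosen = place_new_dataset("d5", dep, placement, k=2)
```

In this example `d5` shares two tasks with `d4` and none with the datasets in centre 0, so it is assigned to centre 1, which keeps the inputs of `t4` and `t5` in one place.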

Journal: Future Generation Computer Systems
Publication Date: Feb 18, 2010
Citations: 350

Similar Papers

A Two-Stage Fuzzy C-Means Data Placement Strategy for Scientific Cloud Workflows
Hamdi Kchaou ... Adel M Alimi
01 Jul 2018

Towards Intelligent Data Placement for Scientific Workflows in Collaborative Cloud Environment
Xin Liu ... Anwitaman Datta
01 May 2011

A Novel Workflow-Level Data Placement Strategy for Data-Sharing Scientific Cloud Workflows
Xuejun Li ... Huikang Yi
IEEE Transactions on Services Computing | VOL. 12
29 Apr 2019

A Data Placement Strategy for Data-Intensive Scientific Workflows in Cloud
Qing Zhao ... Jian Xiao
01 May 2015
