Abstract

As the volume of data handled by applications running on geographically distributed Cloud systems grows, efficient data management has become a crucial performance factor. Alongside basic task scheduling, managing input data on distributed Cloud systems is a genuine challenge, particularly for data-intensive applications. Ideally, each dataset should be stored in the same data center as the tasks that consume it, so that all data accesses are local. However, when a task does not need every item in one of its input datasets, sending the entire dataset can incur a severe time overhead. To address this concern, a data fragmentation strategy can be used to partition the datasets and process them in fragmented form. Such a strategy should be flexible enough to support any user-defined partitioning and should minimize the overhead of transferring the data in fragmented form. To simulate both the fragmentation and migration mechanisms and estimate their basic statistics before implementing them in a real Cloud, we chose Cloudsim, a popular simulator for Cloud Computing investigations, with the goal of enhancing it with the corresponding extensions. Our proposed extension, named DFMCloudsim, aims to provide an efficient module for implementing fragmentation and data migration strategies. We validate the extension using various simulated scenarios. The results indicate that it effectively achieves its main objectives and can reduce data transfer overhead by 74.75% compared to our previous work.
