Abstract

Cloud computing has emerged as a main approach for managing huge distributed data in different areas such as scientific operations and engineering experiments. In this regard, data replication in Cloud environments is a key strategy that reduces response time and improves reliability. One of the main features of a distributed environment is to replicate data in various sites such that popular data would be more available. Whenever a site does not have a needed data file, it will have to fetch it from other locations. Therefore, the parallel download approach is applied to reduce download time. It enables a user to get various parts of a file from several sites simultaneously. In this work, we present a data replication strategy, named the Dynamic Popularity aware Replication Strategy (DPRS), which is presented on Cloud system leveraging data access behavior. DPRS replicates only a small amount of frequently requested data file based on 80/20 idea. It determines to which site the file is replicated based on number of requests, free storage space, and site centrality. We introduce a parallel downloading approach that replicates data segments and parallel downloads replicated data fragments, to enhance the overall performance. We evaluate effective network usage, mean job execution time, hit ratio, total number of replications and percentage of storage filled by using the CloudSim simulator. Extensive experimentations demonstrate the effectiveness of DPRS under most of access patterns.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call