Abstract

Effective data management is a crucial problem in distributed systems such as data grids and clouds. It can be achieved by replicating files wisely, which reduces data access time and improves data availability, reliability, and system load balancing. Determining a reasonable number of replicas and appropriate locations for them is an essential decision in cloud computing. In this paper, a new dynamic replication strategy called Data Mining-based Data Replication (DMDR) is proposed, which determines the correlation among accessed data files using the file access history. We focus particularly on how knowledge extracted with maximal frequent correlated pattern mining improves data replication: files with high dependency can be grouped in the same replica set. Through the DMDR strategy, replicas can be stored in suitable locations, with reduced access latency, according to the centrality factor. In addition, due to the finite storage space of each node, replicas that are useful for future tasks can be wastefully deleted and replaced with less beneficial ones. Simulation results using CloudSim indicate that the DMDR strategy has a relative advantage in effective network usage, average response time, and hit ratio in comparison with current methods. It can be concluded from this investigation that data mining techniques are effective and helpful in finding users’ future access behavior in a cloud environment.
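
To illustrate the idea of grouping highly dependent files from the access history, the following is a minimal sketch and not the DMDR algorithm itself. It swaps in a simpler technique: pairwise co-access counting with a support threshold and the all-confidence correlation measure, followed by union-find grouping, instead of maximal frequent correlated pattern mining. The function name `correlated_groups`, the thresholds, and the session format are assumptions made for illustration only.

```python
from collections import Counter
from itertools import combinations


def correlated_groups(history, min_support=2, min_all_conf=0.6):
    """Group files whose pairwise co-access is both frequent and correlated.

    history: list of sessions, each a list of file names accessed together
             (a hypothetical format, not taken from the paper).
    """
    file_count = Counter()   # number of sessions containing each file
    pair_count = Counter()   # number of sessions containing each file pair
    for session in history:
        files = sorted(set(session))
        file_count.update(files)
        pair_count.update(combinations(files, 2))

    # Union-find over files; correlated pairs are merged into one group.
    parent = {f: f for f in file_count}

    def find(f):
        while parent[f] != f:
            parent[f] = parent[parent[f]]  # path halving
            f = parent[f]
        return f

    for (a, b), n in pair_count.items():
        # all-confidence of a pair: support(a, b) / max(support(a), support(b))
        all_conf = n / max(file_count[a], file_count[b])
        if n >= min_support and all_conf >= min_all_conf:
            parent[find(a)] = find(b)

    groups = {}
    for f in file_count:
        groups.setdefault(find(f), set()).add(f)
    return sorted((sorted(g) for g in groups.values()), key=lambda g: g[0])


if __name__ == "__main__":
    sessions = [["a", "b"], ["a", "b", "c"], ["c", "d"],
                ["a", "b"], ["c", "d"], ["e"]]
    print(correlated_groups(sessions))
```

Each resulting group is a candidate replica set in the sense used above; choosing where to place a set (for example by a centrality factor) and which replicas to evict is outside the scope of this sketch.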
