Data replication is a mechanism for creating copies of the same file block at multiple sites. Cloud storage systems use it to improve data availability and performance. Current replication techniques let users fix the number of replicas, but users cannot readily determine which files should be replicated or how many copies are necessary, and the performance of the cloud storage system suffers as a result. Our proposed replication method therefore determines the replication factor from the support values of the file blocks, so that each block receives the number of replicas it actually needs. We also propose an efficient technique that places replicas based on local support values to further improve the performance of the cloud storage system. Our results indicate that the proposed replication algorithm performs better than the algorithm used in the Hadoop Distributed File System.
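As an illustration only, the idea of deriving a replication factor from support values can be sketched as follows. The abstract does not state the formula, so everything specific here is an assumption: support is taken to be the fraction of logged accesses that reference a block, the mapping to a replica count is a simple linear rule, and the names `support_values`, `replication_factor`, `min_replicas`, and `max_replicas` are hypothetical.

```python
from collections import Counter
from math import ceil


def support_values(access_log):
    """Assumed definition: share of logged accesses that reference each block."""
    counts = Counter(access_log)
    total = len(access_log)
    return {block: n / total for block, n in counts.items()}


def replication_factor(support, min_replicas=1, max_replicas=5):
    """Map a support value in [0, 1] to a replica count.

    The linear scaling is a placeholder rule, not the paper's formula.
    """
    extra = ceil(support * (max_replicas - min_replicas))
    return min(max_replicas, min_replicas + extra)


# Hypothetical access log: each entry is the ID of a block that was read.
log = ["b1", "b2", "b1", "b3", "b1", "b2", "b1", "b1"]
factors = {b: replication_factor(s) for b, s in support_values(log).items()}
print(factors)  # {'b1': 4, 'b2': 2, 'b3': 2}
```

Under this placeholder rule, frequently accessed blocks receive more replicas than rarely accessed ones, in contrast to a single fixed replication factor applied to every block.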