Abstract

Replication is a well-known technique for improving the reliability and performance of a Data Grid. Keeping content consistent across all distributed replicas is an important problem in Data Grids. A replica consistency protocol that uses the classical propagation method, called the radial method, suffers from high overhead at the master replica site, while the line method suffers from high delay. In a Data Grid, not all replicas can be treated in the same way, since some are in greater demand than others. If the replicas in highest demand are updated first, a greater number of clients can access the updated content in a shorter period of time. In this study, a scalable replica consistency protocol based on an asynchronous aggressive update propagation technique is developed to maintain replica consistency in a Data Grid. The protocol aims at delay reduction and load balancing, such that replicas with high access weight are updated faster than the others. The simulation results show that the proposed protocol is capable of sending updates to high-access replicas in a short period while reducing the total update propagation response time and achieving load balancing.

Highlights

  • Large-scale scientific applications such as high energy physics, data mining, molecular modeling, earth sciences and large-scale simulation produce large amounts of data [1,2]

  • In our model, each data file has a Logical File Name (LFN), a unique logical identifier for the desired data content, and each physical copy is identified by a unique Physical File Name (PHN) that specifies its location on a storage system

  • The time it takes for an update message to reach a grid site is the propagation delay associated with that site; the comparison in Figs. 8 and 9 is expressed as the average number of hops needed for updates to reach a grid site within a given range of access weight


Summary

Introduction

Large-scale scientific applications such as high energy physics, data mining, molecular modeling, earth sciences and large-scale simulation produce large amounts of data [1,2]. Data Grids use replication to reduce access latency, improve data locality, and increase the robustness, scalability and performance of distributed applications. Several data replication techniques [3,4] have been developed to support high-performance data access and to improve data availability and load balancing for remotely produced scientific data. Most of these techniques do not provide replica consistency when updates occur. Some applications cannot tolerate the resulting dirty reads at replicas. For these applications, in order to keep the replicas consistent, data updates occurring at a master site should be propagated immediately to the other sites holding its replicas. Push-based update propagation is more suitable for applications that need updates to reach the secondary replica sites immediately.
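The trade-off between the radial and line methods mentioned in the abstract can be made concrete with a small sketch. The code below is an illustration only and is not the protocol proposed in the paper: the function names, the site names, the access weights and the fixed fan-out tree are all assumptions introduced here. It counts how many hops an update needs to reach each replica site and how many messages the master must send, and it uses a weight-ordered tree merely as one plausible compromise between the two classical methods.

```python
# Illustrative sketch: hop counts and master load for three push-based
# propagation layouts. All names and numbers here are hypothetical.

def radial(sites):
    """Master sends the update directly to every replica site."""
    hops = {s: 1 for s in sites}
    return hops, len(sites)  # (hops per site, messages sent by master)

def line(sites):
    """Master sends to the first site, which forwards to the next, and so on."""
    hops = {s: i for i, s in enumerate(sites, start=1)}
    return hops, min(1, len(sites))

def weighted_tree(sites, fanout):
    """Fill a fanout-ary tree breadth-first, so earlier sites sit nearer the master.

    This is an assumed compromise layout, not the paper's protocol.
    """
    nodes = [None] + list(sites)  # index 0 stands for the master
    depth = [0] * len(nodes)
    for i in range(1, len(nodes)):
        depth[i] = depth[(i - 1) // fanout] + 1  # parent index in a heap-like layout
    hops = {nodes[i]: depth[i] for i in range(1, len(nodes))}
    return hops, min(fanout, len(sites))

def weighted_mean_hops(hops, weights):
    """Average hop count, weighting each site by its access weight."""
    total = sum(weights.values())
    return sum(weights[s] * h for s, h in hops.items()) / total

if __name__ == "__main__":
    # Hypothetical access weights per replica site.
    weights = {"A": 50, "B": 30, "C": 10, "D": 5, "E": 3, "F": 2}
    # Sites in highest demand come first.
    order = sorted(weights, key=weights.get, reverse=True)
    for name, (hops, load) in {
        "radial": radial(order),
        "line": line(order),
        "tree(fanout=2)": weighted_tree(order, 2),
    }.items():
        print(name, "master sends:", load,
              "weighted mean hops:", weighted_mean_hops(hops, weights))
```

Under these assumptions the radial layout gives every site a single hop but makes the master send one message per replica, the line layout keeps the master load at one message but lets the hop count grow with the number of sites, and the weight-ordered tree bounds the master load by the fan-out while keeping the most heavily accessed sites close to the master.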

