Abstract

Data Grids support data-intensive applications in wide-area Grid systems. They use local storage systems as distributed data stores by replicating datasets. Replication is a common technique in distributed environments because it can improve data availability, data access performance, and load balancing. Usually a complete file is copied to many Grid sites for local access. However, a site may need only parts of a replica, so storing only those parts uses the storage systems more efficiently. In this paper, we propose a concept called fragmented replicas: when replicating, a site may store only the partial contents it needs locally. This can greatly reduce the storage space wasted on unused data. We also propose a block mapping procedure that determines the distribution of blocks across the available servers for later replica retrieval. With this procedure, a server can make its partial replica contents available to other members of the Grid system, and a client can retrieve a fragmented replica directly. After block mapping, co-allocation schemes can be used to retrieve datasets from the available servers. Simulation shows that co-allocation schemes also improve download performance in a fragmented replication system.
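
The idea of mapping blocks to servers and then retrieving them from several servers can be illustrated with a minimal Python sketch. It is not the procedure or the co-allocation schemes proposed in the paper: the data layout (each server advertising inclusive block ranges), the function names `build_block_map` and `assign_blocks`, and the least-loaded selection rule are all assumptions made for illustration.

```python
from collections import defaultdict


def build_block_map(holdings, num_blocks):
    """Map each block index to the sorted list of servers that hold it.

    `holdings` is a hypothetical layout: server name -> list of
    inclusive (start, end) block ranges stored on that server.
    """
    block_map = defaultdict(list)
    for server in sorted(holdings):
        for start, end in holdings[server]:
            for block in range(start, end + 1):
                if 0 <= block < num_blocks and server not in block_map[block]:
                    block_map[block].append(server)
    return dict(block_map)


def assign_blocks(block_map, num_blocks):
    """Choose one server per block, preferring the least-loaded holder.

    This is a simple stand-in for a co-allocation scheme, assumed here
    only to show how a block map could drive parallel retrieval.
    """
    load = defaultdict(int)
    plan = {}
    for block in range(num_blocks):
        holders = block_map.get(block)
        if not holders:
            raise ValueError(f"block {block} is not held by any server")
        # Ties are broken by server name so the plan is deterministic.
        chosen = min(holders, key=lambda s: (load[s], s))
        plan[block] = chosen
        load[chosen] += 1
    return plan


# Three servers, each holding only a fragment of a 6-block file.
holdings = {
    "A": [(0, 3)],
    "B": [(2, 5)],
    "C": [(0, 1), (4, 5)],
}
block_map = build_block_map(holdings, 6)
plan = assign_blocks(block_map, 6)
```

In this example no server holds the complete file, yet every block is reachable, and the plan spreads the six blocks evenly across the three servers.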
