Abstract

A Scientific Data Grid mostly deals with large computational problems. It provides geographically distributed resources for large-scale, data-intensive applications that generate large scientific data sets. This requires scientists in modern scientific computing communities to manage massive amounts of very large data collections that are geographically distributed. Research in the area of grid computing has produced various ideas and solutions to address these requirements. However, the number of participants (scientists and institutions) involved in this kind of environment is now increasing tremendously. This situation has led to a problem of scalability. To overcome this problem, we need a data grid model that scales well with the increasing number of users. Peer-to-peer (P2P) is one architecture that promises scalability and support for dynamism. In this paper, we present a P2P model for Scientific Data Grid that uses P2P services to address the scalability problem. Using this model, we study and propose various decentralized discovery strategies intended to address the problem of scalability. We also investigate the impact on the proposed discovery strategies of data replication, which addresses the data distribution and reliability problem for our Scientific Data Grid model. For the purpose of this study, we have developed and used our own data grid simulation written using PARSEC. We illustrate our P2P Scientific Data Grid model and the data grid simulation used in this study. We then analyze the performance of the discovery strategies with and without replication strategies in terms of their success rates, bandwidth consumption, and average number of hops.

Highlights

  • In modern scientific communities, the number of researchers involved in managing massive amounts of very large data collections through a geographically distributed environment is increasing

  • We propose, develop, and discuss an unstructured P2P Scientific Data Grid model

  • We show how the data replication strategies we propose, which address the data distribution and reliability problem for our Scientific Data Grid model, affect the proposed discovery strategies

Summary

INTRODUCTION

The number of researchers involved in managing massive amounts of very large data collections through a geographically distributed environment is increasing. They need an infrastructure that can provide services to support their requirements and create a high-performance computing environment. We take a different approach to the scalability problem by proposing a Scientific Data Grid model based on Peer-to-Peer (P2P) that uses P2P services. Using this model, we study various decentralized discovery strategies, which are at the heart of this model.
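
This summary does not spell out the discovery strategies themselves. As a generic illustration of decentralized discovery in an unstructured P2P network, the following minimal Python sketch uses TTL-limited flooding, which is an assumed stand-in and not necessarily one of the paper's strategies. The names `flood_search`, `graph`, `holders`, and `ttl` are hypothetical, and the toy topology is invented for the example. The function reports the three quantities the study measures: whether the search succeeded, the number of hops, and the number of messages sent (a rough proxy for bandwidth consumption).

```python
from collections import deque


def flood_search(graph, holders, start, ttl):
    """TTL-limited flooding from `start` (illustrative assumption only).

    graph   : dict mapping each peer to a list of its neighbours
    holders : set of peers that store the requested data set or a replica
    ttl     : maximum number of hops a query may travel
    Returns (found, hops, messages); hops is None when nothing is found.
    """
    visited = {start}
    queue = deque([(start, 0)])
    messages = 0
    while queue:
        node, hops = queue.popleft()
        if node in holders:
            # Simplification: stop at the first holder reached.
            return True, hops, messages
        if hops == ttl:
            continue  # TTL exhausted, this peer does not forward further
        for neighbour in graph[node]:
            messages += 1  # every forwarded query costs bandwidth
            if neighbour not in visited:
                visited.add(neighbour)
                queue.append((neighbour, hops + 1))
    return False, None, messages


# Hypothetical five-peer overlay; peer 4 holds the only copy of the data.
graph = {0: [1, 2], 1: [0, 3], 2: [0, 3], 3: [1, 2, 4], 4: [3]}

print(flood_search(graph, {4}, start=0, ttl=3))     # single copy, generous TTL
print(flood_search(graph, {4}, start=0, ttl=2))     # single copy, tight TTL
print(flood_search(graph, {1, 4}, start=0, ttl=2))  # extra replica on peer 1
```

In this toy overlay, placing a replica on peer 1 turns a failed search at `ttl=2` into a one-hop success with fewer messages, which is the kind of interaction between replication and discovery that the study evaluates.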

RELATED WORKS
THE P2P DATA GRID MODEL
THE SIMULATION STUDY
EXPERIMENTS AND RESULTS
CONCLUSION AND FUTURE WORK