Abstract
Scientific Data Grids mostly deal with large computational problems. They provide geographically distributed resources for large-scale, data-intensive applications that generate large scientific data sets. As a result, scientists in modern scientific computing communities must manage massive, geographically distributed data collections. Grid research has produced various ideas and solutions to address these requirements. However, the number of participants (scientists and institutions) involved in this kind of environment is increasing tremendously, which has led to a scalability problem. To overcome this problem, we need a data grid model that scales well with the increasing number of users. Peer-to-peer (P2P) is one architecture that promises scalability and dynamism. In this paper, we present a P2P model for Scientific Data Grid that uses P2P services to address the scalability problem. Using this model, we study and propose various decentralized discovery strategies intended to address the scalability problem. We also investigate how data replication, which addresses the data distribution and reliability problem in our Scientific Data Grid model, affects the proposed discovery strategies. For this study, we developed and used our own data grid simulation, written using PARSEC. We illustrate our P2P Scientific Data Grid model and the data grid simulation used in this study. We then analyze the performance of the discovery strategies with and without replication strategies in terms of success rate, bandwidth consumption, and average number of hops.
Highlights
In modern scientific communities, the number of researchers involved in managing massive amounts of very large data collections through a geographically distributed environment is increasing
We propose and develop an unstructured P2P Scientific Data Grid model
We show results on the impact of the data replication strategies, which we proposed to address the data distribution and reliability problem in our Scientific Data Grid model, on the proposed discovery strategies
Summary
The number of researchers involved in managing massive amounts of very large data collections through a geographically distributed environment is increasing. They need an infrastructure that can provide services to support their requirements and create a high-performance computing environment. We take a different approach to the scalability problem by proposing a Scientific Data Grid model that is based on Peer-to-Peer (P2P) and uses P2P services. Using this model, we study various decentralized discovery strategies, which are the heart of this model.