Abstract

E-science projects in various disciplines generate large amounts of data and face a fundamental challenge: thousands of researchers want to obtain new scientific results by logically relating subsets of the total volume of data. Because these data sets are huge and widely distributed, e-science communities are investigating different technologies to provide fast access to them as they grow. Among these technologies, Peer-to-Peer (P2P) and Data Grid are two models that fit these requirements well, because of their potential to provide a high quality of service at low cost. In this paper, we explore the possibility of using the P2P paradigm for data-intensive e-science applications on the Grid. We argue that additional support is required to achieve fast access to such data, and we propose eSciGrid to overcome the scalability barriers in today’s e-science communities. eSciGrid allows e-science communities to achieve a high query throughput through a decentralized protocol that integrates caching with query processing. The protocol takes into account the physical distance between peers and the amount of traffic carried by each node. The result of this integration is constant complexity for moderate queries and fast data transfers between Grid peers. Our results show that eSciGrid increases the performance of data access on e-science Grids.

