Abstract

Future terabit networks promise to dramatically improve big data movement between geographically dispersed HPC data centers. The scientific community is taking advantage of terabit networks such as DOE’s ESnet, accelerating the trend toward a small world of collaboration between geo-distributed HPC data centers. Such networks improve information and resource sharing for joint simulation and analysis between the HPC data centers. However, effective collaboration faces several challenges: providing a collective view of multi-site shared data, minimizing performance degradation of scientific applications running in such collaboration environments and, most critical of all, defining data sharing policies for such collaborations. In this paper, we propose SciSpace, a Scientific Collaboration Workspace for collaborative data centers. It provides a global view of information shared from multiple geo-distributed HPC data centers under a single workspace. SciSpace supports native data access to achieve high performance when data must be read or written in the native data center namespace. This is accomplished by integrating an on-demand metadata export protocol. To optimize scientific collaborations across HPC data centers, SciSpace implements a search and discovery service. For evaluation, we configured two small-scale, geo-distributed HPC data centers, each equipped with LustreFS and connected via a high-speed InfiniBand network comparable to a terabit network such as DOE’s ESnet. We show the feasibility of SciSpace using real scientific datasets and applications. The evaluation results show an average 36% performance boost when the proposed native data access is employed in collaborations. We also emulate a real climate science collaboration to validate the usefulness of SciSpace.

