Abstract

InfiniCloud 2.0 is the world’s first native InfiniBand High Performance Cloud distributed across four continents, spanning Asia, Australia, Europe and North America. The project provides researchers with instant access to computational, storage and network resources distributed around the globe. These resources are then used to build a geographically distributed virtual supercomputer, complete with a globally accessible parallel file system and job scheduling. This paper describes the high-level design and implementation details of InfiniCloud 2.0. A gene sequencing pipeline and a plasma physics simulation code are used to demonstrate the system’s capabilities.

Highlights

  • The original InfiniCloud system, presented at Supercomputing Frontiers Singapore in March 2015, enabled researchers to quickly and efficiently copy large volumes of data between Singapore and Australia, as well as to process that data using two discrete, native InfiniBand High Performance Clouds [8].

  • While the unique capabilities of InfiniCloud enabled new ways of processing data, they also inspired a whole new range of research questions: Can the entire capacity of the system be aggregated? Do entire data collections need to be copied for processing, or can data be accessed in place? How does the InfiniCloud design scale to an arbitrary number of sites? How do we ensure a consistent state across all InfiniCloud clusters? And can the resources across four continents be joined together using the InfiniCortex fabric to create a Galaxy of Supercomputers [14]?

  • In Section 1 and Section 2 we demonstrated the concept, design and implementation of a geographically distributed High Performance Cloud system capable of aggregating high performance computing resources available across four continents.

Summary

Introduction

The original InfiniCloud system, presented at Supercomputing Frontiers Singapore in March 2015, enabled researchers to quickly and efficiently copy large volumes of data between Singapore and Australia, as well as to process that data using two discrete, native InfiniBand High Performance Clouds [8]. While the unique capabilities of InfiniCloud enabled new ways of processing data, they also inspired a whole new range of research questions: Can the entire capacity of the system be aggregated? In this paper we aim to explore these research questions and propose new ways of utilizing distributed computation, storage and network resources, using a variety of novel tools and techniques. We take advantage of the expansion and enhancement of the InfiniCortex fabric that took place in 2015 [12], including full support for InfiniBand subnets and routing, greater available bandwidth and, last but not least, a growing number of participating sites.

The Network
Connecting to Europe and additional US based facilities
Routable InfiniBand
The Cloud
InfiniCloud rationale and the existing solutions
Cloud Architecture
Cloud implementation
Cloud Controller
Compute Nodes
Resource scheduling
Availability zones
Instance types
Communication patterns
Bandwidth and latency considerations
BeeGFS
BeeGFS configuration for high bandwidth-delay products
Optimizing data access patterns
ElastiCluster
Implementation of variant calling genome analysis pipeline
Geopipeline performance analysis
Extempore
Findings
Conclusions
