Abstract

Commercial Cloud computing is becoming mainstream, with funding agencies moving beyond prototyping and starting to fund production campaigns as well. An important aspect of any scientific computing production campaign is data movement, both incoming and outgoing. While the performance and cost of VMs are relatively well understood, network performance and cost are not. This paper characterizes networking in various regions of Amazon Web Services, Microsoft Azure and Google Cloud Platform, both between Cloud resources and major DTNs in the Pacific Research Platform, including OSG data federation caches in the network backbone, and inside the Clouds themselves. The paper contains a qualitative analysis of the results as well as latency and peak throughput measurements. It also includes an analysis of the costs involved with Cloud-based networking.

Highlights

  • Commercial Cloud computing is gaining popularity in the realm of scientific computing

  • We created a set of files, each 1 GB in size, and uploaded them to the object storage in one Cloud region per tested commercial Cloud provider, namely one in Amazon Web Services (AWS), one in Azure and one in Google Cloud Platform (GCP)

  • Unlike most on-prem network infrastructures, networking is a billable entity in the commercial Clouds, with the end user responsible for all costs associated with data movement


Summary

Introduction

Commercial Cloud computing is gaining popularity in the realm of scientific computing. While the performance and cost of commercial Cloud compute instances are relatively well documented and understood, the same cannot be said for network links and data movement at large. To address this deficiency, we ran a network characterization campaign in early autumn of 2019, collecting information about peak throughput and latencies in various regions of Amazon Web Services (AWS), Microsoft Azure and Google Cloud Platform (GCP), both between Cloud resources and major DTNs in the Pacific Research Platform (PRP/TNRP) [5], including Open Science Grid (OSG) [6] data federation caches in the Internet network backbone [7], and between different regions inside the Clouds themselves.
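
As a rough illustration of how a peak throughput figure can be derived from raw transfer records, the sketch below converts bytes moved per measurement interval into Gbps and takes the maximum. It is not the tooling used in the campaign: the Sample type, the function names, the decimal definition of a gigabyte and the sample values are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Iterable

# Assumption: "1 GB" is read as a decimal gigabyte (10**9 bytes).
GB = 10**9


@dataclass
class Sample:
    """One measurement interval (hypothetical record layout)."""
    bytes_moved: int   # total bytes moved by all concurrent transfers
    seconds: float     # wall-clock length of the interval


def throughput_gbps(sample: Sample) -> float:
    """Convert bytes moved in an interval to gigabits per second."""
    return sample.bytes_moved * 8 / sample.seconds / 1e9


def peak_throughput_gbps(samples: Iterable[Sample]) -> float:
    """Peak throughput is the best interval observed, not the average."""
    return max(throughput_gbps(s) for s in samples)


# Made-up numbers, for illustration only: three 10-second intervals.
samples = [Sample(10 * GB, 10.0), Sample(25 * GB, 10.0), Sample(20 * GB, 10.0)]
peak = peak_throughput_gbps(samples)
print(f"peak throughput: {peak:.1f} Gbps")
```

Reporting the maximum rather than the mean matches the stated goal of characterizing peak throughput; a sustained-rate figure would need averaging over the full run instead.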

Networking within a Cloud region
Networking between regions within a Cloud provider
Networking between commercial Cloud resources and on-prem
Fetching data from the Clouds
Fetching data into the Clouds
Comparing xrootd over HTTP against GridFTP
Commercial Cloud networking cost
Conclusions
Findings
CloudBank
