Abstract
This paper aims to establish a performance baseline for an HPC installation of OpenStack. We created InfiniCloud, a distributed High Performance Cloud hosted on remote nodes of InfiniCortex. InfiniCloud compute nodes use high performance Intel Haswell and Sandy Bridge CPUs, SSD storage and 64-256 GB RAM. All computational resources are connected by high performance InfiniBand (IB) interconnects and are capable of trans-continental IB communication using Obsidian Longbow range extenders. We benchmark the performance of our test-beds using micro-benchmarks for TCP bandwidth, IB bandwidth and latency, file creation performance, MPI collectives and Linpack. This paper compares different CPU generations across virtual and bare-metal environments. The results show modest improvements in TCP and IB bandwidth and latency on Haswell, with performance being largely dependent on the IB hardware. Virtualisation overheads were minimal, and near-native performance is possible for sufficiently large messages. From the Linpack testing, users can expect more than twice the performance for their applications on Haswell-provisioned VMs. On Haswell hardware, the difference between native and virtual performance is still significant for MPI collective operations. Finally, our parallel filesystem testing revealed virtual performance approaching native only for non-sync/fsync file operations.
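As an illustration of the MPI collective micro-benchmarks mentioned above, the following is a minimal sketch of an MPI_Allreduce timing loop. It is not the authors' benchmark code; the iteration count and message-size sweep are illustrative assumptions.

```c
/*
 * Minimal sketch of an MPI collective micro-benchmark (illustrative only,
 * not the paper's actual test harness).
 *
 * Compile: mpicc -O2 allreduce_bench.c -o allreduce_bench
 * Run:     mpirun -np 16 ./allreduce_bench
 */
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    const int iters = 1000;               /* timed iterations per message size */
    MPI_Init(&argc, &argv);

    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    /* Sweep message sizes from 8 B up to 4 MiB of doubles. */
    for (size_t n = 1; n <= (1u << 19); n *= 2) {
        double *sendbuf = malloc(n * sizeof(double));
        double *recvbuf = malloc(n * sizeof(double));
        for (size_t i = 0; i < n; i++) sendbuf[i] = (double)rank;

        /* Warm-up call so connection setup is excluded from the timing. */
        MPI_Allreduce(sendbuf, recvbuf, (int)n, MPI_DOUBLE, MPI_SUM, MPI_COMM_WORLD);

        MPI_Barrier(MPI_COMM_WORLD);
        double t0 = MPI_Wtime();
        for (int it = 0; it < iters; it++)
            MPI_Allreduce(sendbuf, recvbuf, (int)n, MPI_DOUBLE, MPI_SUM, MPI_COMM_WORLD);
        double t1 = MPI_Wtime();

        if (rank == 0)
            printf("%10zu bytes  %12.3f us/iter\n",
                   n * sizeof(double), 1e6 * (t1 - t0) / iters);

        free(sendbuf);
        free(recvbuf);
    }

    MPI_Finalize();
    return 0;
}
```

Comparing the per-iteration times reported by such a loop on bare-metal and on SR-IOV-enabled VMs is the kind of measurement that underlies the native-versus-virtual comparison reported in the paper.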
Highlights
Cloud computing offers resources on-demand as an Infrastructure-as-a-Service (IaaS) platform, providing good flexibility in resource allocation and usage that can be managed by both end-users and administrators
Since 2009, the National Computational Infrastructure (NCI) in Australia has been providing a cloud computing platform service for compute and I/O-intensive workloads to its big data research community [2]
As the same Mellanox hardware was capable of 56Gb InfiniBand (IB) and Single Root IO Virtualisation (SR-IOV), A*CRC and NCI
Summary
Cloud computing offers resources on-demand as an Infrastructure-as-a-Service (IaaS) platform, providing good flexibility in resource allocation and usage that can be managed by both end-users and administrators. Encouraged by rapid adoption of cloud services, NCI enhanced the interconnect from 10Gb to 56Gb Ethernet using Mellanox hardware together with Single Root IO Virtualisation (SR-IOV) as a first phase. This brings significant performance improvements to traditional HPC applications that typically require a fast interconnect. Network I/O remained a challenge for achieving near-native performance amongst virtual machines due to the packet processing, switching and CPU interruptions involved. These overheads become very significant when attempting to make use of the high-speed interconnects that typical HPC workloads require, and of associated features such as RDMA, which need to work effectively in virtual environments. To solve the network I/O problem, the SR-IOV technology was drawn up by the PCI Special Interest Group. This is a hardware-based virtualisation method that allows near-native performance of network interfaces to be realised, where network I/O can bypass the hypervisor and avoid involvement of the CPU. Amazon Web Services provides SR-IOV-enabled Gigabit Ethernet (GigE) for their C3 instances, a feature marketed as “Enhanced Networking”, and there have been numerous performance studies of SR-IOV-enabled Gigabit Ethernet and InfiniBand usage [3, 6, 8, 9, 10].
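To make the SR-IOV mechanism concrete, the following is a minimal sketch, assuming a Linux host, of how the virtual functions exposed by an SR-IOV capable NIC can be inspected through sysfs. The PCI address used is a placeholder, not one from the paper's test-bed.

```c
/*
 * Minimal sketch (Linux host assumed): report how many SR-IOV virtual
 * functions a PCI network device supports and how many are enabled.
 * The default PCI address below is a placeholder.
 *
 * Compile: cc -O2 sriov_check.c -o sriov_check
 * Run:     ./sriov_check 0000:04:00.0
 */
#include <stdio.h>

/* Read a single integer from a sysfs attribute file; -1 on failure. */
static int read_sysfs_int(const char *path)
{
    FILE *f = fopen(path, "r");
    if (!f)
        return -1;
    int value = -1;
    if (fscanf(f, "%d", &value) != 1)
        value = -1;
    fclose(f);
    return value;
}

int main(int argc, char **argv)
{
    const char *pci_addr = (argc > 1) ? argv[1] : "0000:04:00.0";
    char path[256];

    /* sriov_totalvfs: maximum VFs the device supports;
       sriov_numvfs:   VFs currently enabled by the host. */
    snprintf(path, sizeof(path),
             "/sys/bus/pci/devices/%s/sriov_totalvfs", pci_addr);
    int total = read_sysfs_int(path);

    snprintf(path, sizeof(path),
             "/sys/bus/pci/devices/%s/sriov_numvfs", pci_addr);
    int enabled = read_sysfs_int(path);

    if (total < 0) {
        fprintf(stderr, "Device %s does not expose SR-IOV attributes\n", pci_addr);
        return 1;
    }
    printf("SR-IOV on %s: %d of %d virtual functions enabled\n",
           pci_addr, enabled, total);
    return 0;
}
```

Each enabled virtual function appears to the host as its own PCI device and can be passed through to a guest, which is how network I/O bypasses the hypervisor as described above.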