Abstract

Containerisation has demonstrated its efficiency for application deployment in Cloud Computing. Containers encapsulate complex programs and their dependencies in isolated environments, which makes applications more portable; as a result, they are being adopted in High Performance Computing (HPC) clusters. Singularity, initially designed for HPC systems, has become their de facto standard container runtime. Nevertheless, conventional HPC workload managers, unlike container orchestrators, lack micro-service support and deeply integrated container management. We introduce a Torque-Operator, which serves as a bridge between an HPC workload manager (TORQUE) and a container orchestrator (Kubernetes). We propose a hybrid architecture, in which container orchestration is performed on two levels, that integrates HPC and Cloud clusters seamlessly with little interference to the HPC systems.

Highlights

  • We present a hybrid architecture composed of a High Performance Computing (HPC) cluster and a Cloud cluster, in which container orchestration on the HPC cluster can be performed by the container orchestrator (i.e. Kubernetes) located in the Cloud cluster

  • This paper extends our work-in-progress study [39], which briefly described the preliminary design of the Torque-Operator and the platform architecture that enables the convergence of HPC and Cloud systems

  • To demonstrate the performance improvement that the HPC cluster can bring, and to illustrate that the approaches introduced can be applied in more general cases, we present a performance evaluation on a Message Passing Interface (MPI) benchmark, Bayesian Probabilistic Matrix Factorization (BPMF) [56, 57]

Summary

Introduction

Rather than virtualising a complete operating system (OS) as a Virtual Machine (VM) does, containers share the host OS. This feature makes containers more lightweight than VMs. Containers are dedicated to running micro-services [3], and one container mostly hosts one application. An HPC cluster is typically equipped with a workload manager, which is composed of a resource manager and a job scheduler. A container orchestrator, such as Kubernetes [3], does not on its own address all the requirements of HPC systems and therefore cannot replace the existing workload managers in HPC centres. Conversely, HPC workload managers, such as TORQUE, lack the micro-service support and deeply integrated container management capabilities in which container orchestrators manifest their efficiency. The “Conclusion and future work” section concludes this paper and proposes future work.
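
The bridging idea can be illustrated with a minimal sketch: a job description of the kind an orchestrator might hold is translated into a TORQUE batch script that launches a Singularity container. This is not the Torque-Operator implementation. The `ContainerJob` class, its fields and the `to_pbs_script` function are hypothetical names introduced for illustration only; the `#PBS` directives and the `singularity run` command are standard TORQUE and Singularity usage.

```python
import shlex
from dataclasses import dataclass


@dataclass
class ContainerJob:
    """Hypothetical job description; not a type from the paper or from Kubernetes."""
    name: str
    image: str                  # path to a Singularity image file (assumed to exist on the HPC side)
    nodes: int = 1              # number of compute nodes requested
    ppn: int = 1                # processors per node
    walltime: str = "00:10:00"  # requested wall-clock limit, HH:MM:SS


def to_pbs_script(job: ContainerJob) -> str:
    """Render the job as a TORQUE (PBS) batch script that runs the container."""
    lines = [
        "#!/bin/bash",
        f"#PBS -N {job.name}",
        f"#PBS -l nodes={job.nodes}:ppn={job.ppn}",
        f"#PBS -l walltime={job.walltime}",
        # The container is started by Singularity inside the batch job.
        f"singularity run {shlex.quote(job.image)}",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Example values are made up for illustration.
    print(to_pbs_script(ContainerJob(name="bpmf", image="bpmf.sif", nodes=2, ppn=4)))
```

On a TORQUE cluster, a script of this form would typically be submitted with `qsub`. How the actual operator transfers the script to the HPC cluster and reports results back is described in the full text, not in this sketch.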

Background
Findings
Discussion
Conclusion and future work