Abstract

Packaging parallel applications in containers has become increasingly popular on High Performance Computing (HPC) systems. These applications depend on the Message Passing Interface (MPI) for communication between the processes of a parallel job. This paper explores the design considerations that container maintainers must weigh when building and running their applications on HPC systems. These include identifying and resolving the cgroup, namespace, and security boundaries that some container runtimes impose and that may hinder an MPI library from performing efficiently. The paper examines the impact on MPI libraries of two opposing container launch models, and of various models for incorporating a system-optimized MPI library, including a novel hybrid technique: BYO-MPI with system-mounted components. It then analyses the critical problem of cross-version compatibility between libraries interacting across the container boundary as the container image and the HPC system evolve over time. The paper concludes by translating the lessons learned from running MPI applications on traditional HPC systems to container orchestration environments such as Kubernetes. The discussion of each topic provides a foundation for container maintainers to make informed choices that best suit their specific MPI application and HPC system requirements.
