Abstract

High Performance Computing (HPC) facilities provide vast computational power and storage, but they generally run fixed environments designed to address the most common software needs locally, which makes it challenging for users to bring their own software. To overcome this issue, most HPC facilities have added support for HPC-friendly container technologies such as Shifter, Singularity, or Charliecloud. These container technologies are all compatible with the more popular Docker containers; however, the implementation and use of those containers differ for each HPC-friendly container technology. These usage differences make it difficult for an end user to submit to and use different HPC sites without adjusting their workflows and software. The issue is exacerbated when attempting to use workflow management software across sites with differing container technologies.

The SCAILFIN project aims to develop and deploy artificial intelligence (AI) and likelihood-free inference (LFI) techniques and software using scalable cyberinfrastructure (CI) that span multiple sites. The project has extended the CERN-based REANA framework, a platform designed to enable analysis reusability and reproducibility while supporting different workflow engine languages, in order to support submission to different HPC facilities. The work presented here focuses on the development of an abstraction layer that supports different container technologies and different transfer protocols for files and directories between the HPC facility and the REANA cluster edge service from the user’s workflow application.

Highlights

  • Detecting and supporting different container technologies: users can interact with the REANA cluster through a Python-based client package

  • The REANA framework [1], used by the SCAILFIN project [2] to create analysis workflows that are reproducible, works on top of Kubernetes in order to orchestrate all service components and the scheduling of workers to run the payloads

  • While the REANA team is working on supporting different submission backends in the framework (HTCondor [3] and SLURM [4] at present), these are currently focused on working with CERN resources [5]


Summary

Detecting and supporting different container technologies

Users can interact with the REANA cluster through a Python-based client package. This client allows the user to create workflows, upload files to the workflow area, and define the Docker container images needed to run each step in the workflow (see Figure 3). As a result, workflows can run on multiple HPC facilities that support container technologies, and the user does not have to provide the launch parameters specific to each container technology.
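The idea of hiding per-runtime launch parameters behind a single interface can be illustrated with a minimal Python sketch. This is not the SCAILFIN or REANA implementation. The function names (`detect_runtime`, `build_launch_command`), the detection strategy (looking for the runtime executable on `PATH`), and the Charliecloud image directory under `/var/tmp` are all assumptions made for illustration. Only the basic command shapes (`singularity exec docker://...`, `shifter --image=docker:...`, `ch-run <dir> -- ...`) follow the usual invocations of those tools.

```python
import shlex
import shutil
from typing import Callable, Dict, List, Optional


def _singularity(image: str, command: List[str]) -> List[str]:
    # Singularity can run a Docker image directly via the docker:// URI scheme.
    return ["singularity", "exec", f"docker://{image}", *command]


def _shifter(image: str, command: List[str]) -> List[str]:
    # Shifter selects a previously pulled Docker image with --image=docker:<name>.
    return ["shifter", f"--image=docker:{image}", *command]


def _charliecloud(image: str, command: List[str]) -> List[str]:
    # ch-run expects an unpacked image directory. The /var/tmp location and
    # the name sanitizing below are assumptions for this sketch only.
    image_dir = "/var/tmp/" + image.replace("/", "_").replace(":", "_")
    return ["ch-run", image_dir, "--", *command]


# Hypothetical registry: runtime executable name -> command builder.
BUILDERS: Dict[str, Callable[[str, List[str]], List[str]]] = {
    "singularity": _singularity,
    "shifter": _shifter,
    "ch-run": _charliecloud,
}


def detect_runtime(which: Callable[[str], Optional[str]] = shutil.which) -> Optional[str]:
    """Return the first supported runtime whose executable is found, else None."""
    for executable in BUILDERS:
        if which(executable):
            return executable
    return None


def build_launch_command(image: str, command: str,
                         runtime: Optional[str] = None) -> List[str]:
    """Translate (Docker image, command) into the argv for the site's runtime."""
    runtime = runtime or detect_runtime()
    if runtime not in BUILDERS:
        raise RuntimeError("no supported container runtime found")
    return BUILDERS[runtime](image, shlex.split(command))
```

With a layer like this, a workflow step only has to name its Docker image and command, for example `build_launch_command("python:3.8", "python run.py")`. The site-specific argv is chosen where the job lands, which is the separation of concerns the client is described as providing.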

Data transfer management
Conclusions