Abstract
Increased operational effectiveness and the dynamic integration of only temporarily available compute resources (opportunistic resources) will become more and more important in the next decade, due to the scarcity of resources for future high energy physics experiments as well as the desired integration of cloud and high performance computing resources. This results in a more heterogeneous compute environment, which poses huge challenges for the computing operation teams of the experiments. At the Karlsruhe Institute of Technology (KIT) we design solutions to tackle these challenges. In order to ensure an efficient utilization of opportunistic resources and unified access to the entire infrastructure, we developed the Transparent Adaptive Resource Dynamic Integration System (TARDIS), a scalable multi-agent resource manager that provides interfaces to provision resources of various providers and to integrate them dynamically and transparently into one common overlay batch system. Operational effectiveness is guaranteed by relying on COBalD, the Opportunistic Balancing Daemon, and its simple approach of taking into account the utilization and allocation of the different resource types in order to run each workflow on the best-suited resource. In this contribution we present the current status of integrating various HPC centers and cloud providers into the compute infrastructure at the Karlsruhe Institute of Technology as well as our experiences gained in a production environment.
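The balancing idea mentioned in the abstract, steering demand per resource type from its utilization and allocation, can be illustrated with a minimal sketch. This is not the COBalD API or the authors' implementation: the `ResourcePool` class, the `adjust_demand` function, the threshold values, and the step size are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class ResourcePool:
    """Hypothetical stand-in for one resource type (e.g. an HPC or cloud site)."""
    name: str
    demand: float       # amount of resources currently requested from the provider
    utilization: float  # fraction of acquired resources doing useful work (0..1)
    allocation: float   # fraction of acquired resources assigned to jobs (0..1)


def adjust_demand(pool, low_utilization=0.5, high_allocation=0.9, step=1.0):
    """Return an updated demand for one pool.

    Thresholds and step size are illustrative assumptions, not values
    taken from the paper.
    """
    if pool.utilization < low_utilization:
        # Acquired resources are poorly used: request fewer of this type.
        return max(0.0, pool.demand - step)
    if pool.allocation > high_allocation:
        # Acquired resources are almost fully allocated: request more.
        return pool.demand + step
    # Otherwise keep the current demand unchanged.
    return pool.demand


pools = [
    ResourcePool("hpc", demand=10.0, utilization=0.95, allocation=0.97),
    ResourcePool("cloud", demand=10.0, utilization=0.30, allocation=0.60),
    ResourcePool("grid", demand=10.0, utilization=0.80, allocation=0.85),
]
new_demand = {pool.name: adjust_demand(pool) for pool in pools}
print(new_demand)
```

In this sketch, demand grows only for the pool whose resources are both well used and nearly fully allocated, and shrinks for the pool with low utilization, so workloads drift towards the resource types that suit them best.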
Highlights
Nowadays, computing in high energy physics (HEP) relies predominantly on homogeneous resources provided by the Worldwide LHC Computing Grid (WLCG) [1], based on a flat-budget funding model.
In contrast to the homogeneous resources provided by the WLCG, utilising opportunistic resources results in a more heterogeneous computing environment that is not fully controlled by WLCG policies and poses huge challenges for the computing operation teams of the experiments.
We have presented a general multi-experiment solution for integrating opportunistic resources into WLCG computing by associating them with existing nearby WLCG sites and utilising well-established Grid computing elements as a single point of entry for the experiments.
Summary
Nowadays, computing in high energy physics (HEP) relies predominantly on homogeneous resources provided by the Worldwide LHC Computing Grid (WLCG) [1], based on a flat-budget funding model. Recent studies of the HEP Software Foundation [2], which legitimately assume a continuation of the flat-budget funding model, show that the expected technology advance will not be sufficient to meet the computing requirements of future HEP experiments. One promising approach to narrow the gap is to supplement the WLCG with resources that are not permanently dedicated, but only temporarily available for HEP computing tasks. These so-called opportunistic resources are mainly provided by High Performance Computing (HPC) centres as well as commercial and public cloud providers.