The ATLAS experiment has more than 18 years of experience using workload management systems to develop and deploy workflows that process and simulate data on its distributed computing infrastructure. Simulation, processing and analysis of LHC experiment data require the coordinated work of heterogeneous computing resources. In particular, the ATLAS experiment utilizes the resources of 250 computing centers worldwide, the power of supercomputing centers, and national, academic and commercial cloud computing resources. In this contribution, we present new techniques introduced in the workflow management system software to improve efficiency in a cost-effective way. The evolution from a grid framework to new types of computing facilities, such as clouds and HPCs, is described, as well as new types of production and analysis workflows.