Abstract

The ATLAS experiment has successfully used its Gaudi/Athena software framework for data taking and analysis during the first LHC run, with billions of events processed. However, the design of Gaudi/Athena dates from early 2000, and both the software and the physics code have been written using a single-threaded, serial design. This programming model has increasing difficulty in exploiting the potential of current CPUs, which offer their best performance only when multiple cores and wide vector registers are fully used. Future CPU evolution will intensify this trend, with core counts increasing and memory per core falling. With current memory consumption for 64-bit ATLAS reconstruction in a high-luminosity environment approaching 4 GB, it will become impossible to fully occupy all cores in a machine without exhausting available memory. However, since performance per watt will be a key metric, a mechanism must be found to use all cores as efficiently as possible.

In this paper we report on our progress with a practical demonstration of the use of multi-threading in the ATLAS reconstruction software, using the GaudiHive framework. We have expanded support to Calorimeter, Inner Detector, and Tracking code, and we discuss the changes that were necessary, both to the framework and to the tools and algorithms used, to allow the serially designed ATLAS code to run. We report on the performance gains and on the general lessons learned about the code patterns employed in the software, including which patterns were identified as particularly problematic for multi-threading. We also present our findings on implementing a hybrid multi-threaded / multi-process framework, which takes advantage of the strengths of each type of concurrency while avoiding some of their corresponding limitations.

Highlights

  • Given that Athena is based on Gaudi, trying out Gaudi Hive is an obvious first step toward a multi-threaded framework

  • Explore what needs to be modified in user code and in the framework to make it functional

  • The best speedup of event processing with respect to serial running is about 3.3x, with a 28% increase in memory consumption
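
To put the last highlight in perspective, the short calculation below compares the quoted 3.3x speedup and 28% memory increase with a naive alternative. It is only an illustrative sketch: the comparison against fully independent serial processes that share no memory is an assumption introduced here, not a measurement from the paper, and a real multi-process setup may share part of its memory.

```python
# Figures quoted in the highlights above.
speedup = 3.3            # best event-processing speedup relative to serial
memory_increase = 0.28   # fractional increase in memory consumption

# Memory of the multi-threaded job, in units of one serial job.
threaded_memory = 1.0 + memory_increase

# ASSUMPTION (not from the source): the alternative is a set of fully
# independent serial processes that share no memory, so matching the
# same throughput needs `speedup` copies of the serial footprint.
independent_memory = speedup * 1.0

# Throughput gained per unit of memory, relative to a single serial job.
throughput_per_memory = speedup / threaded_memory

print(f"threaded job memory:     {threaded_memory:.2f}x serial")
print(f"independent jobs memory: {independent_memory:.2f}x serial")
print(f"throughput per memory:   {throughput_per_memory:.2f}x serial")
```

Under that assumption, the multi-threaded job delivers roughly 2.6 times the throughput per unit of memory of a single serial job.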


Summary

Gaudi Hive

Given that Athena is based on Gaudi, trying out Gaudi Hive is an obvious first step toward a multi-threaded framework.

  • By having Algorithms declare their inputs and outputs, the scheduler can automatically execute Algorithms as data becomes available.

  • Build a directed acyclic graph of Algorithm dependencies.

  • Multiple instances of the same Algorithm can exist, each with a different Event Context.

  • Cloning is not obligatory, balancing memory usage with thread safety.

Extract the Algorithm/Data dependency graph and timing data from running normal ATLAS reconstruction on a ttbar event. Create a fake CPU Cruncher Algorithm to mimic the CPU usage of each reconstruction Algorithm using the real timing data.

Configuration parameters:

  • Number of events in flight (parallel events)

  • Number of Algorithms in flight (limit on the number of simultaneously executing Algorithms)

  • Number of threads

  • Cloning (multiple instances of each Algorithm)

[Figure: Speedup with respect to serial Hive vs. number of threads, where serial Hive = 1 event in flight, 1 Algorithm in flight, 1 thread.]
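
The data-driven scheduling described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the Gaudi Hive implementation or its API: the `Cruncher` class, the `run_event` function, the algorithm and data names, and the timings are all hypothetical, and the sketch handles a single event only (no multiple events in flight and no cloning). `time.sleep` stands in for the CPU work that a real cruncher would burn.

```python
from concurrent.futures import ThreadPoolExecutor, FIRST_COMPLETED, wait
from dataclasses import dataclass
import time


@dataclass(frozen=True)
class Cruncher:
    """Hypothetical stand-in algorithm with declared inputs and outputs."""
    name: str
    inputs: frozenset
    outputs: frozenset
    seconds: float = 0.01  # illustrative cost, not a measured ATLAS timing

    def execute(self):
        time.sleep(self.seconds)  # placeholder for real CPU work
        return self.outputs


def run_event(algorithms, n_threads=2, max_algs_in_flight=2):
    """Run each algorithm once, as soon as all of its inputs are available."""
    available = set()         # data objects produced so far
    pending = list(algorithms)
    running = {}              # future -> algorithm
    order = []                # completion order, for inspection
    with ThreadPoolExecutor(max_workers=n_threads) as pool:
        while pending or running:
            # Start every pending algorithm whose inputs already exist,
            # up to the limit on simultaneously executing algorithms.
            for alg in list(pending):
                if len(running) >= max_algs_in_flight:
                    break
                if alg.inputs <= available:
                    pending.remove(alg)
                    running[pool.submit(alg.execute)] = alg
            if not running:
                names = ", ".join(a.name for a in pending)
                raise RuntimeError("unsatisfied inputs for: " + names)
            # Block until at least one algorithm finishes, then publish
            # its outputs so that dependent algorithms can be scheduled.
            done, _ = wait(list(running), return_when=FIRST_COMPLETED)
            for future in done:
                alg = running.pop(future)
                available |= future.result()
                order.append(alg.name)
    return order


# Hypothetical four-algorithm dependency graph (names are illustrative).
demo = [
    Cruncher("Electrons", frozenset({"clusters", "tracks"}), frozenset({"electrons"})),
    Cruncher("Clusters", frozenset({"cells"}), frozenset({"clusters"})),
    Cruncher("CaloCells", frozenset(), frozenset({"cells"})),
    Cruncher("Tracks", frozenset(), frozenset({"tracks"})),
]
order = run_event(demo)
print(order)
```

The algorithms are listed in an arbitrary order on purpose: the execution order follows from the declared inputs and outputs alone. `CaloCells` and `Tracks` have no inputs and may run concurrently, so their relative completion order can vary between runs.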
Calo Testbed Memory Usage and Timing
Findings
Summary