Abstract
ATLAS's current software framework, Gaudi/Athena, has been very successful for the experiment in LHC Runs 1 and 2. However, its single-threaded design has been recognised for some time to be increasingly problematic as CPUs have increased core counts and decreased available memory per core. Even the multi-process version of Athena, AthenaMP, will not scale to the range of architectures we expect to use beyond Run 2.
ATLAS examined the requirements on an updated multi-threaded framework and laid out plans for a new framework, including better support for High Level Trigger use cases, in 2014. In this paper we report on our progress in developing the new multi-threaded, task-parallel extension of Athena, AthenaMT.
Implementing AthenaMT has required many significant code changes. Progress has been made in updating key concepts of the framework, allowing different levels of thread safety in algorithmic code. Substantial advances have also been made in implementing a data-flow-centric design, as well as in the development of the new ‘event views’ infrastructure. These event views support partial event processing and are an essential component in supporting the High Level Trigger's processing of certain regions of interest. A major effort has also been invested in producing an early version of AthenaMT that can run simulation on many-core architectures, which has augmented the understanding gained from work on earlier ATLAS demonstrators.
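To illustrate what a data-flow-centric design means in general terms, the following minimal Python sketch schedules algorithms purely from their declared inputs and outputs: an algorithm becomes runnable once all its inputs are in the event store, and independent algorithms may run concurrently. This is not AthenaMT or Gaudi code. The function `run_dataflow`, the algorithm names and the data keys are all hypothetical and chosen only for illustration.

```python
from concurrent.futures import ThreadPoolExecutor, wait, FIRST_COMPLETED


def run_dataflow(algorithms, initial_store, max_workers=4):
    """Run algorithms in an order derived only from their data dependencies.

    algorithms: dict mapping name -> (input_keys, output_keys, function).
    Each function receives a dict of its inputs and returns a dict of outputs.
    Hypothetical illustration; not the AthenaMT scheduler.
    """
    store = dict(initial_store)   # stand-in for a per-event data store
    pending = dict(algorithms)    # algorithms not yet submitted
    running = {}                  # future -> (name, output_keys)
    order = []                    # completion order, for inspection

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        while pending or running:
            # An algorithm is ready when every declared input is available.
            ready = [name for name, (ins, _, _) in pending.items()
                     if all(key in store for key in ins)]
            for name in ready:
                ins, outs, fn = pending.pop(name)
                inputs = {key: store[key] for key in ins}
                running[pool.submit(fn, inputs)] = (name, outs)

            if not running:
                # Nothing is running and nothing can start: a dependency
                # is missing or the graph contains a cycle.
                raise RuntimeError(f"unsatisfiable inputs for: {sorted(pending)}")

            done, _ = wait(list(running), return_when=FIRST_COMPLETED)
            for future in done:
                name, outs = running.pop(future)
                result = future.result()
                for key in outs:
                    store[key] = result[key]   # publish declared outputs only
                order.append(name)

    return store, order


# Hypothetical toy algorithms: two independent producers and one consumer.
def make_clusters(data):
    return {"Clusters": sum(data["Cells"])}


def make_tracks(data):
    return {"Tracks": len(data["Hits"])}


def make_electrons(data):
    return {"Electrons": data["Clusters"] + data["Tracks"]}


example_algorithms = {
    "MakeElectrons": (["Clusters", "Tracks"], ["Electrons"], make_electrons),
    "MakeClusters": (["Cells"], ["Clusters"], make_clusters),
    "MakeTracks": (["Hits"], ["Tracks"], make_tracks),
}
```

In this sketch, `MakeClusters` and `MakeTracks` have no mutual dependency and can be submitted together, while `MakeElectrons` waits until both of its inputs exist. The listing order of the algorithms plays no role; only the declared data dependencies do.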
Highlights
Dark silicon might start to dominate in the future — specialist computing units lit up only when needed
Multi-processing with copy-on-write (Athena MultiProcess, or AthenaMP) is serving ATLAS well in Run 2, but we don’t expect this to scale for Run 3
A similar ATLAS example, CaloHive, demonstrated that memory savings could be achieved in practice
Summary
• Great challenge of computing in the next decade will be one of power
• nJ per instruction
• Note it is likely that the power costs of memory access would be greater than CPU power in an exascale machine
• This is driving evolution of larger numbers of cores on dies
• More transistors but no more clock speed
• And lower amounts of memory per core