Abstract

Scalable multithreading poses challenges to I/O for the ATLAS experiment. The performance of a thread-safe I/O strategy may depend upon many factors, including I/O latencies, whether tasks are CPU- or I/O-intensive, and thread count. In a multithreaded framework, an I/O infrastructure must efficiently supply event data to and collect it from many threads processing multiple events in flight. In particular, on-demand reading from multiple threads may challenge caching strategies that were developed for serial processing and may need to be enhanced. This I/O infrastructure must also address how to read, make available, and propagate in-file metadata and other non-event data needed as context for event processing. We describe the design and scheduling of I/O components in the ATLAS multithreaded control framework, AthenaMT, for both event and non-event I/O. We discuss issues associated with exploiting the multithreading capabilities of our underlying persistence technology, ROOT, in a manner harmonious with the ATLAS framework's own approach to thread management. Finally, we discuss opportunities for evolution and simplification of I/O components that have successfully supported ATLAS event processing for many years from their serial incarnations to their thread-safe counterparts.

Highlights

  • The period of data taking at the LHC will stress the ATLAS [1] computing infrastructure in two important ways: the event rate and the event complexity

  • High Level Trigger (HLT): The increasing complexity of events challenges both the memory capacity and throughput of the computing farms used for the HLT

  • ATLAS already makes extensive use of ROOT's TTreeCache system when reading ROOT data. This system was designed for reading data sequentially, whereas a multithreaded framework will process events in no particular order. This, combined with on-demand reading, can lead to cache thrashing at cache boundaries: retrieval of a data object for a not-yet-cached event causes the cache to be flushed while the framework is still processing an event that may yet need to retrieve data from the current cache (see the sketch after this list)

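The sketch below illustrates the two access patterns behind this highlight using plain ROOT rather than the ATLAS framework. The file name events.root, the tree name CollectionTree, the 30 MB cache size, and the scattered entry order are illustrative assumptions; only TTreeCache itself and the contrast between sequential and out-of-order, on-demand reading come from the text above.

// Minimal, illustrative ROOT macro; names and sizes are assumptions.
#include "TFile.h"
#include "TTree.h"

void ttreecache_sketch()
{
   // Open a hypothetical input file and event tree.
   TFile* file = TFile::Open("events.root");
   if (!file || file->IsZombie()) return;
   TTree* tree = static_cast<TTree*>(file->Get("CollectionTree"));
   if (!tree) return;

   // Enable a 30 MB TTreeCache and cache all branches. The cache prefetches
   // baskets for the entry range it has learned, which assumes access that is
   // roughly sequential.
   tree->SetCacheSize(30 * 1024 * 1024);
   tree->AddBranchToCache("*", true);

   // Sequential access: the pattern TTreeCache was designed for.
   for (Long64_t i = 0; i < tree->GetEntries(); ++i) {
      tree->GetEntry(i);
   }

   // Out-of-order, on-demand access, as a framework with many events in
   // flight may produce: crossing a cache boundary for one entry forces a
   // refill while earlier in-flight events may still need baskets from the
   // previous range -- the thrashing described in the highlight above.
   const Long64_t scattered[] = {0, 512, 3, 1024, 7};  // illustrative order
   for (Long64_t entry : scattered) {
      if (entry < tree->GetEntries()) tree->GetEntry(entry);
   }

   file->Close();
}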

Summary

Introduction

The period of data taking at the LHC will stress the ATLAS [1] computing infrastructure in two important ways: the event rate and the event complexity. Combined with computing architectures that scale by adding cores and coprocessors rather than clock cycles, this requires an approach that emphasizes parallelism, either using multiple processes across the cores or using multithreading within a single process. For memory-constrained processes such as event reconstruction of raw data, the increasing event complexity (expanding memory footprint) and the growing core count per node (reduced memory per core) require a multithreaded approach. Moving data in and out of these processes is a necessary component of their efficient utilization.

Use cases
AthenaMT
Multithreading on input
Multithreading on output
ROOT dependencies
Non-event data: metadata and conditions
Conclusion