Abstract

Over the next few years, the LHC will prepare for the upcoming High-Luminosity upgrade, in which it is expected to deliver ten times more pp collisions. This will create a harsher radiation environment and higher detector occupancy. In this context, the ATLAS experiment, one of the general-purpose experiments at the LHC, plans substantial upgrades to the detectors and to the trigger system in order to efficiently select events. Similarly, the Data Acquisition (DAQ) system will need a redesigned data-flow architecture to accommodate the large increase in event and data rates. The Phase-II DAQ design involves a large distributed storage system that buffers data read out from the detector, while a computing farm (Event Filter) analyzes and selects the most interesting events. This system will have to handle 5.2 TB/s of input data at an event rate of 1 MHz and provide access to 3 TB/s of these data for the filtering farm. A possible implementation for such a design is based on distributed file systems (DFS), which are becoming ubiquitous in the big data industry. Features of DFS such as replication strategies and smart placement policies match the distributed nature and the requirements of the new data-flow system. This paper presents an up-to-date performance evaluation of some of the DFS currently available: GlusterFS, HadoopFS and CephFS. After characterizing the future data-flow system's workload, we report on small-scale raw performance and scalability studies. Finally, we conclude on the suitability of such systems under the tight constraints expected for the ATLAS experiment in Phase-II and, more generally, on the benefits the HEP community can draw from these storage technologies.
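To give a rough sense of the scale these figures imply, the numbers quoted above (5.2 TB/s aggregate input at a 1 MHz event rate, with 3 TB/s readable by the Event Filter) already fix the average event size, and fix the per-server load once a cluster size is assumed. The sketch below is a back-of-the-envelope calculation only; the number of storage servers (`N_SERVERS`) and the replication factor are illustrative assumptions, not parameters from the paper.

```python
# Back-of-the-envelope sizing for the Phase-II Dataflow storage system.
# Rate figures are taken from the abstract; N_SERVERS and REPLICATION
# are illustrative assumptions, not values from the paper.

INPUT_RATE_TBPS = 5.2      # aggregate detector readout into the buffer (TB/s)
EVENT_RATE_HZ = 1_000_000  # event rate into the buffer (1 MHz)
EF_READ_TBPS = 3.0         # bandwidth the Event Filter must be able to read (TB/s)

N_SERVERS = 500            # assumed number of storage servers (hypothetical)
REPLICATION = 2            # assumed DFS replication factor (hypothetical)

# Average event size implied by the input data rate and the event rate.
event_size_mb = INPUT_RATE_TBPS * 1e6 / EVENT_RATE_HZ   # TB/s -> MB per event
print(f"average event size: {event_size_mb:.1f} MB")     # ~5.2 MB

# Per-server bandwidth: writes are amplified by replication, reads are not.
write_per_server_gbps = INPUT_RATE_TBPS * REPLICATION * 1e3 / N_SERVERS
read_per_server_gbps = EF_READ_TBPS * 1e3 / N_SERVERS
print(f"write load per server: {write_per_server_gbps:.1f} GB/s")
print(f"read load per server:  {read_per_server_gbps:.1f} GB/s")
```

Whether a given DFS can sustain this kind of sustained per-node load, and how its replication and placement policies affect it, is what the raw-performance and scalability measurements in the paper aim to establish.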

Highlights

  • The Large Hadron Collider (LHC) is a particle accelerator that collides proton bunches at a design center-of-mass energy of 14 TeV and at a rate of 40 MHz

  • The Dataflow system is responsible for buffering, transporting and formatting the event data, acting as an interface between the Readout and the trigger systems

  • A more refined selection is applied by the Event Filter (EF) which takes as input the L0 events stored in the Dataflow system

Keywords: Distributed file systems · storage technologies · software-defined storage · ATLAS Data Acquisition · Dataflow

1. Introduction

The Large Hadron Collider (LHC) is a particle accelerator that collides proton bunches at a design center-of-mass energy of 14 TeV and at a rate of 40 MHz.

2. Phase-II TDAQ Architecture

In order to take advantage of the full potential of the HL-LHC, the trigger and data acquisition systems of the ATLAS experiment will be completely redesigned. The Dataflow system is responsible for buffering, transporting and formatting the event data, acting as an interface between the Readout and the trigger systems.
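To make the workload characterization mentioned in the abstract concrete, the sketch below mimics the access pattern the Dataflow system imposes on the storage layer: sequential writes of whole, event-sized records, with only a fraction of them read back by the Event Filter. It assumes the DFS under test is mounted at a POSIX path (GlusterFS and CephFS provide this natively; HDFS typically needs a FUSE or NFS gateway). The mount point, event size, sample count and read fraction are illustrative assumptions, not measured values from the paper.

```python
# Minimal sketch of a Dataflow-like access pattern against a DFS mount.
# MOUNT, EVENT_SIZE, N_EVENTS and READ_FRACTION are illustrative assumptions.
import os
import random
import time

MOUNT = "/mnt/dfs"          # hypothetical POSIX mount point of the DFS under test
EVENT_SIZE = 5 * 1024**2    # ~5 MiB per event, roughly 5.2 TB/s / 1 MHz
N_EVENTS = 200              # small sample for a dry run; scale up for a real test
READ_FRACTION = 0.6         # roughly 3 TB/s read back out of 5.2 TB/s written

payload = os.urandom(EVENT_SIZE)

# Write phase: full-rate, sequential writes of whole events into the buffer.
t0 = time.monotonic()
for i in range(N_EVENTS):
    with open(os.path.join(MOUNT, f"event_{i:08d}.dat"), "wb") as f:
        f.write(payload)
write_rate = N_EVENTS * EVENT_SIZE / (time.monotonic() - t0) / 1024**2
print(f"write throughput: {write_rate:.0f} MiB/s")

# Read phase: the Event Filter samples a subset of the buffered events.
sample = random.sample(range(N_EVENTS), int(N_EVENTS * READ_FRACTION))
t0 = time.monotonic()
for i in sample:
    with open(os.path.join(MOUNT, f"event_{i:08d}.dat"), "rb") as f:
        f.read()
read_rate = len(sample) * EVENT_SIZE / (time.monotonic() - t0) / 1024**2
print(f"read throughput:  {read_rate:.0f} MiB/s")
```

The same pattern can be driven at much larger scale with a standard benchmark tool; the sketch only makes explicit that the workload is dominated by large sequential writes with partial readback by the Event Filter.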
