An Information Aggregation and Analytics System for ATLAS Frontier

Andrea Formica, Millissa Si Amer, Elizabeth J Gallas, Ilija Vukotic, Nurcan Ozturk, Julio Lozano Bahilo

doi:10.1051/epjconf/202024504032

Abstract

ATLAS event processing requires access to centralized database systems where information about calibrations, detector status and data-taking conditions is stored. This processing is done at more than 150 computing sites on a world-wide computing grid, which access the database using the Squid-Frontier system. Some processing workflows have been found to overload the Frontier system because of the Conditions data model currently in use, specifically because some Conditions data requests have a low caching efficiency. The underlying cause is that requests which are non-identical as far as the cache is concerned are actually retrieving a much smaller number of unique payloads. While ATLAS is undertaking an adiabatic transition during the LHC Long Shutdown 2 and Run 3 from the current COOL Conditions data model to a new data model called CREST for Run 4, it is important to identify the problematic Conditions queries with low caching efficiency and to work with the detector subsystems to improve the storage of such data within the current data model. For this purpose ATLAS put together an information aggregation and analytics system. The system is based on aggregated data from the Squid-Frontier logs using the Elasticsearch technology. This paper describes the components of this analytics system, from the server based on a Flask/Celery application to the user interface, and how we use Spark SQL functionalities to filter data for making plots, store the caching efficiency results in an Elasticsearch database and finally deploy the package via a Docker container.

Highlights

In the ATLAS experiment [1] at the LHC these data are stored in a relational database (Oracle), using a model based on the LCG Conditions database infrastructure and the COOL API, both developed mainly by CERN IT [2].
The user interface of the service allows the input data to be filtered and the Parquet files [10] to be prepared based on the task IDs that are selected from those problematic workflows.
We visualize the data in the form of plots in four categories: 1. The count of cached and not-cached queries per database instance, the type of the not-cached queries, the count of queries per COOL data schema, the percentage of cached vs. not-cached queries per COOL data schema, and the percentage of queries per node for a given COOL data schema.
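
As an illustration of the per-schema quantities mentioned in the highlights, the following is a minimal sketch in plain Python of how a cached vs. not-cached percentage and a count of unique payloads could be derived from query records. It is not the system's actual implementation, which according to the abstract relies on Spark SQL and Elasticsearch. The record layout, the field names (`schema`, `cached`, `payload_id`) and the sample values are all hypothetical and chosen only for this example.

```python
from collections import defaultdict

# Hypothetical, simplified query records. The field names and values are
# assumptions made for this sketch, not the real Squid-Frontier log format.
records = [
    {"schema": "SCHEMA_A", "cached": True,  "payload_id": "p1"},
    {"schema": "SCHEMA_A", "cached": False, "payload_id": "p1"},
    {"schema": "SCHEMA_A", "cached": False, "payload_id": "p1"},
    {"schema": "SCHEMA_A", "cached": False, "payload_id": "p2"},
    {"schema": "SCHEMA_B", "cached": True,  "payload_id": "p3"},
    {"schema": "SCHEMA_B", "cached": True,  "payload_id": "p3"},
]


def summarize(records):
    """Return per-schema query count, cached percentage and unique payloads."""
    stats = defaultdict(lambda: {"total": 0, "cached": 0, "payloads": set()})
    for rec in records:
        entry = stats[rec["schema"]]
        entry["total"] += 1
        if rec["cached"]:
            entry["cached"] += 1
        entry["payloads"].add(rec["payload_id"])

    summary = {}
    for schema, entry in stats.items():
        summary[schema] = {
            "queries": entry["total"],
            "cached_pct": 100.0 * entry["cached"] / entry["total"],
            "unique_payloads": len(entry["payloads"]),
        }
    return summary


summary = summarize(records)
for schema, row in sorted(summary.items()):
    print(schema, row)
```

In this toy input, a schema with many queries, a low cached percentage and few unique payloads would be the kind of candidate the abstract describes: distinct requests that end up retrieving the same data.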

Summary

Introduction

In HEP experiments we use the term Conditions data to refer to non-event data representing the detector status (e.g. calibrations and alignments, data-taking conditions and similar). These data are essential for the processing of physics data, in order to reconstruct events optimally and to exploit the detector's full potential. In the ATLAS experiment [1] at the LHC these data are stored in a relational database (Oracle), using a model based on the LCG Conditions database infrastructure and the COOL API, both developed mainly by CERN IT [2].

Access to Conditions data in ATLAS

Architecture

Results

Conclusions


Journal: EPJ Web of Conferences
Publication Date: Jan 1, 2020
License type: CC BY 4.0
