Abstract

The ATLAS model for remote access to database-resident information relies on a limited set of dedicated and distributed Oracle database repositories, complemented by the deployment of Frontier system infrastructure on the WLCG (Worldwide LHC Computing Grid). ATLAS clients with network access obtain the database information they need dynamically by submitting requests to a Squid proxy cache server in the Frontier network, which returns results from its cache or passes new requests along the network to launchpads co-located at one of the Oracle sites (the master Oracle database at CERN or one of the Tier 1 Oracle database replicas). Since the beginning of LHC Run 1, the system has evolved in terms of client, Squid, and launchpad optimizations, but the distribution model has remained fundamentally unchanged. On the whole, the system has been successful in providing data to clients with relatively few disruptions, thanks to its overall redundancy, even while site databases were down. At the same time, its quantitative performance characteristics, such as the global throughput of the system, the load distribution between sites, and the constituent interactions that make up the whole, were largely unknown. More recently, however, information has been collected from launchpad and Squid logs into an Elasticsearch repository, which has enabled a wide variety of studies of the system. This contribution will describe dedicated studies of the data collected in Elasticsearch over the previous year to evaluate the efficacy of the distribution model. Specifically, we will quantify the advantages that the redundancy of the system offers, as well as related aspects such as the geographical dependence of the wait times clients experience in getting responses to their requests. These studies are essential so that, during LS2 (the long shutdown between LHC Run 2 and Run 3), we can adapt the system in preparation for the expected increase in system load in the ramp-up to Run 3 operations.
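
As a rough illustration of the kind of study described above, the Python sketch below aggregates Squid/launchpad log documents in Elasticsearch by serving site to estimate per-site response times and cache-hit fractions. It is a minimal sketch only: the endpoint, index name, and field names (frontier-squid-logs, site, response_ms, cache_status) are illustrative placeholders and not the actual ATLAS log schema.

# Illustrative sketch only: aggregate Frontier/Squid log documents by serving
# site and report mean response time and cache-hit fraction over the last week.
# Index and field names are hypothetical placeholders, not the real schema.
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")   # placeholder Elasticsearch endpoint

resp = es.search(
    index="frontier-squid-logs",              # hypothetical index name
    size=0,                                   # aggregations only, no raw hits
    query={"range": {"@timestamp": {"gte": "now-7d"}}},
    aggs={
        "by_site": {
            "terms": {"field": "site", "size": 20},
            "aggs": {
                "mean_response_ms": {"avg": {"field": "response_ms"}},
                "cache_hits": {"filter": {"term": {"cache_status": "HIT"}}},
            },
        }
    },
)

for bucket in resp["aggregations"]["by_site"]["buckets"]:
    hit_fraction = bucket["cache_hits"]["doc_count"] / max(bucket["doc_count"], 1)
    mean_ms = bucket["mean_response_ms"]["value"] or 0.0
    print(f"{bucket['key']:<20} mean response {mean_ms:7.1f} ms  "
          f"cache-hit fraction {hit_fraction:.2f}")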

Highlights

  • Modern particle physics experiments store data in a combination of file systems and database systems

  • With the main bottleneck being in the conditions database area (COOL), late in Run 2 we met with an expert panel to review the system and agree on plans for Run 3 operations and Run 4 evolution; it was concluded that ATLAS must continue to use COOL for Run 3 in general, but should take action to modify the areas of problematic COOL storage used by some subsystems

  • We have described the evolution of the distribution of database-related data and associated files in the ATLAS experiment since the beginning of LHC Run 1

Introduction

Modern particle physics experiments store data in a combination of file systems and database systems. ATLAS [1] is no different in this respect, but the scale of the data stored and processed is exceptionally high, so the majority of this processing, by necessity, takes place across a world-wide computing grid. These data processing tasks require information from centralized databases at CERN, such as configuration and control data, detector geometry, and conditions data. Another case for database schema replication arises in the area of metadata.
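
To make the access pattern concrete, the following minimal Python sketch models, in a highly simplified way that does not use the real frontier_client or COOL APIs, how a grid client obtains database-resident information through a caching proxy: a request is answered locally on a cache hit and otherwise forwarded upstream towards a launchpad in front of an Oracle replica. All names in the sketch are illustrative.

# Conceptual model of the Frontier/Squid caching-proxy pattern; the names here
# (CachingProxy, launchpad) are illustrative, not part of any real ATLAS API.
from typing import Callable, Dict


class CachingProxy:
    """Answer queries from a local cache, forwarding misses to an upstream source."""

    def __init__(self, upstream: Callable[[str], str]) -> None:
        self.upstream = upstream            # next hop: another proxy or a launchpad
        self.cache: Dict[str, str] = {}

    def get(self, query: str) -> str:
        if query in self.cache:             # cache hit: answered locally
            return self.cache[query]
        payload = self.upstream(query)      # cache miss: forwarded along the chain
        self.cache[query] = payload
        return payload


def launchpad(query: str) -> str:
    # Stand-in for a Frontier launchpad querying the central Oracle database.
    return f"payload for {query!r}"


site_squid = CachingProxy(upstream=launchpad)
print(site_squid.get("detector geometry, run 12345"))   # first request: forwarded
print(site_squid.get("detector geometry, run 12345"))   # repeat request: cache hit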

ATLAS Databases and replication use cases
Evolution of schema replication
Evolution of conditions file distribution
DB Release files
AMI replication to CERN
Outlook and Future Direction
Findings
Summary and Conclusion