Abstract

With the exponential growth of LHC (Large Hadron Collider) data in 2011, and more coming in 2012, distributed computing has become the established way to analyse collider data. The ATLAS grid infrastructure includes almost 100 sites worldwide, ranging from large national computing centers to smaller university clusters. These facilities are used for data reconstruction and simulation, which are centrally managed by the ATLAS production system, and for distributed user analysis. To ensure the smooth operation of such a complex system, regular tests of all sites are necessary to validate the site capability of successfully executing user and production jobs. We report on the development, optimization and results of an automated functional testing suite using the HammerCloud framework. Functional tests are short lightweight applications covering typical user analysis and production schemes, which are periodically submitted to all ATLAS grid sites. Results from those tests are collected and used to evaluate site performances. Sites that fail or are unable to run the tests are automatically excluded from the PanDA brokerage system, therefore avoiding user or production jobs to be sent to problematic sites.

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.