Abstract
The ATLAS trigger and data acquisition online farm is composed of nearly 3,000 computing nodes with various configurations, functions and requirements. Maintaining such a cluster is a major challenge from the system administration point of view, so the System Administration team has adopted various tools to help manage the farm efficiently. In particular, a custom central configuration system, ConfDBv2, was developed for overall farm management. The majority of the systems are network booted and run an operating system image provided by a Local File Server (LFS) over the local area network (LAN). This method guarantees the uniformity of the systems and allows, in case of issues, very fast recovery of the local disks, which can be used as scratch area. It also provides greater flexibility, as the nodes can be quickly reconfigured and restarted with a different operating system. A user-friendly web interface offers a quick overview of the current farm configuration and status, allowing changes to be applied to selected subsets or to the whole farm in an efficient and consistent manner. In addition, various actions that would otherwise be time-consuming and error-prone can be executed quickly and safely. We describe the design, functionality and performance of this system and its web-based interface, including its integration with other CERN and ATLAS databases and with the monitoring infrastructure.