Abstract

Over the past two years, the operation of the CERN Data Centres went through significant changes with the introduction of new mechanisms for hardware procurement and new services for cloud provisioning and configuration management, among other improvements. These changes resulted in an increase in resources being operated in a more dynamic environment. Today, the CERN Data Centres provide over 11000 multi-core processor servers, 130 PB disk servers, 100 PB tape robots, and 150 high performance tape drives. To cope with these developments, an evolution of the data centre monitoring tools was also required. This modernisation was based on a number of guiding rules: sustain the increase in resources, adapt to the new dynamic nature of the data centres, make monitoring data easier to share, give more flexibility to Service Managers on how they publish and consume monitoring metrics and logs, establish a common repository of monitoring data, optimise the handling of monitoring notifications, and replace the previous toolset with new open source technologies with large adoption and community support. This contribution describes how these improvements were delivered, presents the architecture and technologies of the new monitoring tools, and reviews the experience of their production deployment.

Highlights

  • The deployment of new tools and workflows to manage CERN Data Centres in the areas of procurement, installation, provisioning, and configuration led to a significant increase in the number of resources to manage

  • When bringing these requests together it became clear that the old monitoring tools (Lemon[1] and SLS[2]) should be replaced: they could not scale to the current needs, their code was old and difficult to maintain, and they provided limited functionality

  • For the alerts layer we have developed the General Notification Infrastructure (GNI)

Summary

Monitoring Evolution at CERN

21st International Conference on Computing in High Energy and Nuclear Physics (CHEP2015), IOP Publishing. Journal of Physics: Conference Series 664 (2015) 052002, doi:10.1088/1742-6596/664/5/052002 (http://iopscience.iop.org/1742-6596/664/5/052002)

Introduction
Streaming to programmatically process monitoring data
Batch Accounting
Component Flume HDFS ElasticSearch GNI
Findings
Conclusions