Abstract

The National Synchrotron Radiation Laboratory (NSRL) facility cluster is a collection of user facilities developed by NSRL, including the Hefei Light Source-II (HLS-II), the Tunable Infrared Laser for Fundamental of Energy Chemistry (FELiChEM), and the THz near-field high-flux material physical property test system (NFTHZ). User facilities generally have high operational availability requirements. The NSRL facility cluster relies on the control infrastructure to provide computing, network, and storage resources, as well as various services that must be available 24 hours a day, 7 days a week. The monitoring system is responsible for tracking the operational status of the control infrastructure, gathering information on faults, performance degradation, and cybersecurity issues, and distributing alarm messages promptly. It helps operators troubleshoot problems efficiently, which improves the availability of the user facilities. The monitoring system is built by integrating several free and open-source software tools. Zabbix is selected as the monitoring tool and collects metrics data from the control infrastructure. Three upper-layer applications are developed for data visualization. The dashboard shows the operational status of the network and various devices. The alarm system collects alarm messages and distributes them via a web-based GUI and WeChat. The reporting system periodically generates metrics and alarm reports. The monitoring system has been in operation since March 2022. The results indicate that it can effectively identify hidden hazards in the control infrastructure.

