Abstract

Artificial Intelligence for IT Operations (AIOps) describes the process of maintaining and operating large IT infrastructures using AI-supported methods and tools at different levels. This includes automated anomaly detection and root cause analysis, remediation and optimization, as well as the fully automated initiation of self-stabilizing activities. While automation is mandatory due to system complexity and the criticality of QoS-bounded responses, the measures compiled and deployed by AI-controlled administration are not always easy to understand or reproduce. Explainability of the actions taken by automated systems is therefore becoming a regulatory requirement for future IT infrastructures. In this paper we present ZerOps, a system we have developed and deployed, as an example of the design of the corresponding architecture, tools, and methods. The system applies deep learning models and data analytics to monitoring data in order to detect and remediate anomalies.

