Detection of Erratic Behavior in Load Balanced Clusters of Servers Using a Machine Learning Based Method

Martin Adam, Dagmar Adamová, Martin Pilát, Luca Magnoni, A Forti, L Betev, O Smirnova, M Litmaath, P Hristov

doi:10.1051/epjconf/201921408030

Abstract

With the explosion in the number of distributed applications, a new dynamic server environment has emerged in which servers are grouped into clusters whose utilization depends on the current demand for the application. To provide reliable and smooth services, it is crucial to detect and fix possible erratic behavior of individual servers in these clusters. Standard techniques deliver suboptimal results for this purpose. We have developed a method based on machine learning techniques that detects outliers indicating a possibly problematic situation. The method inspects the performance of the rest of the cluster and provides system operators with additional information that allows them to quickly identify the failing nodes. We applied this method to develop a Spark application using the CERN MONIT architecture, and with this application we analyzed monitoring data from multiple clusters of dedicated servers in the CERN data center. In this contribution, we present the results achieved with this new method and with the Spark application for analytics of CERN monitoring data.

Highlights

In recent years the challenge of handling big volumes of data has triggered an ever growing production of distributed applications.
If a suspected anomaly is found in one metric, the administrator needs to compare that to the others and hopefully discover the nature of the problem.
Incorporating these metrics in the monitoring systems might be too time-consuming, considering that the lack of skilled administrators often leads to understaffed teams.

Summary

Introduction

In recent years the challenge of handling big volumes of data has triggered an ever growing production of distributed applications. In particular, noticing errors that lead to performance degradation and potential failures can be difficult, let alone diagnosing problems and tracing them to a specific node or a set of nodes. When done manually, these procedures require experts to look through stacks of charts, usually depicting multiple metrics per server. In an attempt to simplify the administrators' work, many applications offer a set of internal metrics describing their performance. Incorporating these metrics in the monitoring systems might be too time-consuming, considering that the lack of skilled administrators often leads to understaffed teams. We discuss the efficiency of such an approach and present plans for future improvements.
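The general idea of judging one server against the rest of its cluster can be illustrated with a small sketch. This is not the method or the Spark application presented in the paper, whose algorithm is not given in this summary; it is a minimal stand-in that assumes a robust z-score (median and median absolute deviation) over a single metric. The function name `flag_outliers`, the threshold of 3.5, the node names, and the sample values are all hypothetical.

```python
from statistics import median


def flag_outliers(cluster_metrics, threshold=3.5):
    """Return nodes whose metric deviates strongly from the cluster median.

    cluster_metrics maps a node name to one metric value (assumed layout).
    The score is a robust z-score based on the median absolute deviation.
    """
    values = list(cluster_metrics.values())
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        # All nodes behave (almost) identically; nothing can be ranked.
        return []
    flagged = []
    for node, value in cluster_metrics.items():
        # 0.6745 rescales the MAD so the score is comparable to a z-score.
        score = 0.6745 * abs(value - med) / mad
        if score > threshold:
            flagged.append(node)
    return flagged


# Hypothetical CPU load of five servers behind one load balancer.
cpu_load = {
    "node01": 0.42,
    "node02": 0.45,
    "node03": 0.40,
    "node04": 0.44,
    "node05": 0.97,
}
print(flag_outliers(cpu_load))
```

Because the servers in a load balanced cluster are expected to carry similar load, the cluster itself serves as the baseline here, so no fixed per-metric alarm threshold has to be maintained. A real deployment would need to handle several metrics and their evolution over time, which this sketch does not attempt.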

Distributed Applications at CERN

Preparation of the Input Data for Algorithms

Analyzing the Data

Conclusion and Future Work

Journal: EPJ Web of Conferences
Publication Date: Jan 1, 2019
License type: CC BY 4.0
