Developing a monitoring system for Cloud-based distributed data-centers

Domenico Elia, Marica Antonacci, Gioacchino Vino, Giacinto Donvito, A Forti, M Litmaath, L Betev, O Smirnova, P Hristov

doi:10.1051/epjconf/201921408012

Open Access

Abstract

Datacenters increasingly cooperate with each other to achieve common and more complex goals. New advanced functionalities, such as anomaly detection and fault pattern recognition, are required to support experts during recovery and management activities. The proposed solution actively supports problem solving for datacenter management teams by automatically providing the root cause of detected anomalies. The project has been developed in Bari, using the ReCaS datacenter as a testbed. Big Data solutions have been selected to properly handle the complexity and size of the data. Criteria such as open-source licensing, a large community, horizontal scalability and high availability were considered, and tools belonging to the Hadoop ecosystem were selected. The collected information is sent to a combination of Apache Flume and Apache Kafka, used as the transport layer, which in turn delivers data to databases and processing components. Apache Spark has been selected as the analysis component. Different kinds of databases have been considered in order to satisfy multiple requirements: Hadoop Distributed File System, Neo4j, InfluxDB and Elasticsearch. Grafana and Kibana are used to show data in dedicated dashboards. The root-cause analysis engine has been implemented using custom machine learning algorithms. Finally, results are forwarded to experts by email or Slack, using Riemann.
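The flow described in the abstract (collect metrics, analyse them, notify experts) can be illustrated with a small stand-alone sketch. It is not the authors' implementation: the abstract does not describe their algorithms, so a plain rolling z-score check stands in for the analysis step, an in-memory loop stands in for the Flume/Kafka transport, and a callback stands in for the Riemann notification. The names `RollingZScoreDetector`, `run_pipeline` and `notify` are hypothetical.

```python
from collections import deque
from statistics import mean, pstdev


class RollingZScoreDetector:
    """Flags a sample that lies far from the mean of recent samples.

    Assumption: a simple z-score rule replaces the paper's custom
    machine learning algorithms, which the abstract does not detail.
    """

    def __init__(self, window=20, threshold=3.0, min_history=5):
        self.history = deque(maxlen=window)
        self.threshold = threshold
        self.min_history = min_history

    def observe(self, value):
        """Return True if `value` is anomalous relative to recent history."""
        is_anomaly = False
        if len(self.history) >= self.min_history:
            mu = mean(self.history)
            sigma = pstdev(self.history)
            if sigma > 0 and abs(value - mu) / sigma > self.threshold:
                is_anomaly = True
        if not is_anomaly:
            # Keep outliers out of the baseline so one spike does not mask the next.
            self.history.append(value)
        return is_anomaly


def run_pipeline(samples, notify, window=20, threshold=3.0):
    """Feed (host, metric, value) samples through one detector per series.

    `notify` is a hypothetical callback standing in for the alerting step.
    Returns the list of alerts that were sent.
    """
    detectors = {}
    alerts = []
    for host, metric, value in samples:
        key = (host, metric)
        if key not in detectors:
            detectors[key] = RollingZScoreDetector(window, threshold)
        if detectors[key].observe(value):
            alert = {"host": host, "metric": metric, "value": value}
            notify(alert)
            alerts.append(alert)
    return alerts
```

Calling `run_pipeline(samples, print)` on a list of `(host, metric, value)` tuples prints one dictionary per detected anomaly. Keeping one detector per `(host, metric)` pair is a design choice of this sketch, so that a noisy series does not distort the baseline of a quiet one.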

Highlights

Datacenters are growing in complexity as they combine different technologies to accomplish increasingly ambitious goals
Unconventional tools are required to monitor the overall datacenter network, and new advanced functionalities, such as anomaly detection and fault pattern recognition, are required to support experts during recovery and management activities
Service malfunctions can be detected using the first source category, but this information alone is not sufficient to identify the root causes

Summary

Introduction

Datacenters are growing in complexity as they combine different technologies to accomplish increasingly ambitious goals. Unconventional tools are required to monitor the overall datacenter network, and new advanced functionalities, such as anomaly detection and fault pattern recognition, are required to support experts during recovery and management activities.

Journal: EPJ Web of Conferences
Publication Date: Jan 1, 2019
Citations: 1
License type: CC BY 4.0
