Anomaly-based Fault Detection System in Distributed System

Byoung Uk Kim, Salim Hariri

doi:10.1109/sera.2007.55

Abstract

One of the important design criteria for distributed systems and their applications is their reliability and robustness to hardware and software failures. The increasing complexity, interconnectedness, dependency, and asynchronous interactions among components, which include hardware resources (computers, servers, network devices) and software (application services, middleware, web services, etc.), make fault detection and tolerance a challenging research problem. In this paper, we present an innovative approach based on statistical and data mining techniques that detects faults (hardware or software) and identifies their source. We monitor and analyze in real time all interactions among the components of a distributed system. We use data mining and supervised learning techniques to obtain rules that accurately model the normal interactions among these components. Our anomaly analysis engine produces an alert as soon as one or more of the interaction rules that capture normal operation are violated because of a software or hardware failure. We evaluate the effectiveness and performance of our approach in detecting software faults that we inject asynchronously, and compare the results across different noise levels.
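The rule-violation idea in the abstract can be illustrated with a minimal sketch. It is not the authors' method: the abstract says the rules come from data mining and supervised learning, whereas this sketch substitutes simple per-feature range rules learned from fault-free records. The names `RangeRule`, `learn_rules`, `detect`, and the features `response_ms` and `bytes_sent` are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class RangeRule:
    """Hypothetical rule: a feature is normal while it stays in [low, high]."""
    feature: str
    low: float
    high: float

    def violated_by(self, record: dict) -> bool:
        value = record[self.feature]
        return value < self.low or value > self.high


def learn_rules(normal_records, margin=0.1):
    """Derive one range rule per feature from fault-free training records.

    Assumption: a simple min/max range with some slack stands in for the
    data mining and supervised learning step described in the abstract.
    """
    rules = []
    for feature in normal_records[0]:
        values = [record[feature] for record in normal_records]
        lo, hi = min(values), max(values)
        slack = (hi - lo) * margin
        rules.append(RangeRule(feature, lo - slack, hi + slack))
    return rules


def detect(rules, record):
    """Return the features whose rules the observed record violates."""
    return [rule.feature for rule in rules if rule.violated_by(record)]


# Made-up interaction measurements, used only to exercise the sketch.
normal = [
    {"response_ms": 12.0, "bytes_sent": 500.0},
    {"response_ms": 15.0, "bytes_sent": 520.0},
    {"response_ms": 11.0, "bytes_sent": 480.0},
]
rules = learn_rules(normal)
alerts = detect(rules, {"response_ms": 90.0, "bytes_sent": 505.0})
print(alerts)
```

In this toy setup, a non-empty result plays the role of the alert, and the violated feature names give a rough hint about where the fault originated.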

