A QoS-configurable failure detection service for internet applications

Rogério C Turchetti, Pierre Sens, Luciana Arantes, Elias P Duarte

doi:10.1186/s13174-016-0051-y

Abstract

Unreliable failure detectors are a basic building block of reliable distributed systems. Failure detectors are used to monitor processes of any application and provide process state information. This work presents an Internet Failure Detector Service (IFDS) for processes running in the Internet on multiple autonomous systems. The failure detection service is adaptive and can be easily integrated into applications that require configurable QoS guarantees. The service is based on monitors that provide global process state information through an SNMP MIB. Monitors at different networks communicate across the Internet using Web Services. The system was implemented and evaluated for monitored processes running both on a single LAN and on PlanetLab. Experimental results show the performance of the detector, in particular the advantages of using the self-tuning strategies to address the requirements of multiple concurrent applications running in a dynamic environment.

Highlights

Consensus [1] and other equivalent problems, such as atomic broadcast and group membership, are used to implement dependable distributed systems [2, 3].
In this work we describe an Internet Failure Detection Service (IFDS) that can be used by applications that consist of processes running on independent autonomous systems of the Internet.
The main purpose of Step 1 is to compute an upper bound η_max for the heartbeat interval, given the input values provided by the application: the upper bound on the detection time (T_D^U), the upper bound on the average mistake duration (T_M^U), and the lower bound on the average mistake recurrence time (T_MR^L).
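As a rough illustration of what such a Step 1 computation might look like, the sketch below derives an upper bound on the heartbeat interval from the upper bound on detection time and the upper bound on average mistake duration. The formula is an assumption: it follows the configuration procedure of Chen, Toueg and Aguilera as commonly stated, and is not given in this text. The function name and the parameters `loss_prob` (message loss probability) and `delay_var` (variance of message delay) are hypothetical names chosen for the example. The lower bound on mistake recurrence time is not used here; in that procedure it would constrain a later step.

```python
def heartbeat_interval_upper_bound(td_upper, tm_upper, loss_prob, delay_var):
    """Sketch of Step 1: upper bound on the heartbeat interval.

    ASSUMPTION: the formula below is taken from the Chen-Toueg-Aguilera
    configuration procedure and is not stated in this document.

    td_upper  -- upper bound on detection time (seconds)
    tm_upper  -- upper bound on average mistake duration (seconds)
    loss_prob -- estimated message loss probability, in [0, 1]
    delay_var -- estimated variance of message delay (seconds squared)
    """
    if td_upper <= 0 or tm_upper <= 0:
        raise ValueError("QoS bounds must be positive")
    if not 0.0 <= loss_prob <= 1.0 or delay_var < 0:
        raise ValueError("invalid network estimates")

    # Fraction of the mistake-duration budget that survives loss and jitter.
    gamma = (1.0 - loss_prob) * td_upper ** 2 / (delay_var + td_upper ** 2)

    # The interval can exceed neither the scaled mistake-duration bound
    # nor the detection-time bound itself.
    eta_max = min(gamma * tm_upper, td_upper)

    # A result of 0 means the requested QoS cannot be achieved.
    return eta_max
```

For example, with a detection bound of 1 s, a mistake-duration bound of 2 s, 50% loss and a delay variance of 1 s², the call `heartbeat_interval_upper_bound(1.0, 2.0, 0.5, 1.0)` yields 0.5 s.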

Summary

Introduction

Consensus [1] and other equivalent problems, such as atomic broadcast and group membership, are used to implement dependable distributed systems [2, 3]. The user must provide a specification of the average mistake recurrence time (T_MR, defined above) and the minimum coverage (C_L), which corresponds to a lower bound on the probability that heartbeat messages are received before the timeout interval expires. These parameters are used by a configurator that relies on Adaptare, a middleware that computes the timeout by estimating distributions based on the stochastic properties of the system on which the failure detector is running. With respect to previous SNMP-based implementations of failure detectors, the major benefit of our proposed service is that it allows the user to specify QoS requirements for each monitored application, including the failure detection time, mistake recurrence time, and mistake duration. Given this input and the perceived network conditions, our service configures and continuously adapts the failure detector parameters, including the heartbeat rate.

Configuring the failure detector service based on QoS parameters

QoS configuration for multiple applications

Experimental results

Conclusion

Journal: Journal of Internet Services and Applications	Publication Date: Sep 26, 2016
Citations: 13	License type: CC BY 4.0
