A Distributed Presence Service over Epidemic Multicast

Peterson Wilges, Regina L.O Moraes, Taisy Silva Weber, Sérgio Luis Cechin

doi:10.4013/jacr.2012.21.05

Abstract

In a dynamic distributed system with a very large number of nodes, such as a cloud, it is sometimes useful to discover which nodes are up at a given time. The number of such nodes changes continually during operation, as some nodes crash and others join the system. In this paper we introduce a presence service implemented over a gossip structure using an epidemic multicast protocol. Unlike other common presence services, our service is fully distributed. Owing to epidemic dissemination and the inherent redundancy provided by the multicast protocol, the service is resilient against message loss and link crashes. In a scenario we developed to evaluate the efficiency and scalability of our presence service, we show how presence notifications propagate to reach all nodes in the group, and how adjustments to the gossip configuration can benefit the efficiency and resilience of notification dissemination. The results of the experimental evaluation show that a distributed approach over epidemic communication leads to a resilient and scalable presence service.

Key words: Presence service, epidemic protocols, resilience, fault tolerance, clouds.

Highlights

A presence service aims to manage presence information about computer nodes in a given network
The results show that a distributed approach over epidemic communication leads to an efficient, resilient and scalable presence service
We can conclude that it is advantageous to use a distributed approach associated with epidemic multicast to build a presence service

Summary

Introduction

A presence service aims to manage presence information about computer nodes in a given network. These faults are commonly used to evaluate the fault coverage and resilience of communication protocols (Siqueira et al., 2009) in fault scenarios. For both evaluations we simulated a network with a fixed number of nodes and different fan-out configurations.

Several test experiments were run under this scenario and these settings. Each experiment runs as follows: one node, the seed, sends a single notification message using epidemic multicast, and from the generated logs we collect the time at which this message arrives at every other node of the group. The experiments show, as expected, that the dissemination time of presence information does not grow linearly with the number of network nodes, because PingCloud uses an epidemic dissemination protocol. The delay was injected to emulate the network’s natural delays and cannot be considered fault emulation.
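The experiment described above can be approximated with a small round-based simulation. The sketch below is not the PingCloud implementation. It is a minimal illustration under stated assumptions: node 0 acts as the seed, every informed node forwards the notification to `fanout` peers chosen uniformly at random in each round, no messages are lost, and dissemination time is measured in gossip rounds rather than in wall-clock time read from logs. The function name `simulate_gossip` and its parameters are hypothetical.

```python
import random


def simulate_gossip(n_nodes, fanout, rng=None, max_rounds=1000):
    """Count the gossip rounds until a notification from node 0 reaches all nodes.

    Assumptions (not taken from the paper): uniform random peer selection,
    no message loss, and every informed node gossips again in every round.
    Returns None if max_rounds is reached before full coverage.
    """
    rng = rng or random.Random()
    informed = {0}  # node 0 plays the role of the seed
    rounds = 0
    k = min(fanout, n_nodes - 1)  # a node cannot pick more peers than exist

    while len(informed) < n_nodes:
        if rounds >= max_rounds:
            return None
        newly_informed = set()
        for node in informed:
            # Sample k distinct peers other than `node` by drawing from
            # n_nodes - 1 slots and shifting indices at or above `node`.
            for slot in rng.sample(range(n_nodes - 1), k):
                peer = slot if slot < node else slot + 1
                newly_informed.add(peer)
        informed |= newly_informed
        rounds += 1
    return rounds


if __name__ == "__main__":
    rng = random.Random(1)
    for n in (100, 1000, 10000):
        for fanout in (2, 4):
            r = simulate_gossip(n, fanout, rng=rng)
            print(f"nodes={n:6d} fanout={fanout} rounds={r}")
```

Running the script for growing group sizes gives a feel for how the round count responds to group size and fan-out. In this model the number of informed nodes can grow by at most a factor of `fanout + 1` per round, so at least log base (`fanout + 1`) of `n_nodes` rounds are needed to reach every node.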

Related work

Conclusions