Novel Crash Recovery Approach for Concurrent Failures in Cluster Federation

Bidyut Gupta, Shahram Rahimi

doi:10.1007/978-3-642-01671-4_39

Abstract

In this paper, we propose a simple and efficient approach to checkpointing and recovery in a cluster computing environment. The recovery scheme handles both orphan and lost messages, whether intra-cluster or inter-cluster. The checkpointing scheme ensures that, after the system recovers from failures, all processes in the different clusters can restart from their respective most recent checkpoints, thereby avoiding any domino effect. That is, the most recent checkpoints always form a consistent recovery line of the cluster federation. The main features of our work are: (i) it uses selective message logging, which enables the initiator process in each cluster to log the minimum number of messages; (ii) the recovery scheme is free of the domino effect and is executed simultaneously by all clusters in the cluster federation; (iii) it considers concurrent failures; and (iv) the message complexity in each cluster, for both the checkpointing and the recovery schemes, is just O(n), where n is the number of processes in the cluster. These features make our algorithm superior to existing works.

Keywords (added by machine, not by the authors): Control Message; Request Message; Recovery Scheme; Initiator Process; Message Complexity
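To make the terms "orphan message", "lost message" and "consistent recovery line" concrete, the following minimal sketch classifies messages against a set of per-process checkpoints using the standard textbook definitions. It is not the algorithm of the paper: the names `Message`, `classify` and `is_consistent`, and the use of integer local event indices to position sends, receives and checkpoints, are assumptions made only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Message:
    """Hypothetical record of one message (illustrative only)."""
    sender: str
    receiver: str
    send_event: int   # local event index of the send at the sender
    recv_event: int   # local event index of the receive at the receiver


def classify(msg, checkpoint):
    """Classify msg against checkpoint, a dict mapping each process to the
    local event index at which its latest checkpoint was taken."""
    send_recorded = msg.send_event <= checkpoint[msg.sender]
    recv_recorded = msg.recv_event <= checkpoint[msg.receiver]
    if recv_recorded and not send_recorded:
        # Receive is part of the saved state, but the send is not.
        return "orphan"
    if send_recorded and not recv_recorded:
        # Send is part of the saved state, but the receive is not.
        return "lost"
    return "ok"


def is_consistent(messages, checkpoint):
    """A set of checkpoints is a consistent recovery line if it contains
    no orphan message; lost messages must be replayed from a log."""
    return all(classify(m, checkpoint) != "orphan" for m in messages)


if __name__ == "__main__":
    ckpt = {"p1": 5, "p2": 5}
    msgs = [
        Message("p1", "p2", send_event=2, recv_event=3),
        Message("p1", "p2", send_event=3, recv_event=7),
        Message("p1", "p2", send_event=6, recv_event=4),
    ]
    for m in msgs:
        print(m, classify(m, ckpt))
    print("consistent:", is_consistent(msgs, ckpt))
```

In this sketch a lost message does not by itself make the checkpoints inconsistent; it only has to be re-delivered after restart, which is the role that message logging plays in a recovery scheme.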
