Availability Of JobTracker Machine In Hadoop/MapReduce Zookeeper Coordinated Clusters

Ekpe Okorafor

doi:10.5121/acij.2012.3302

Abstract

It is difficult to use the traditional Message Passing Interface (MPI) approach to implement synchronization, coordination, and prevent deadlocks in distributed systems. This difficulty is lessened by the use of Apache's Hadoop/MapReduce and Zookeeper to provide Fault Tolerance in a Homogeneously Distributed Hardware/Software environment. A mathematical model for the availability of the JobTracker in Hadoop/MapReduce using Zookeeper's Leader Election Service is presented in this paper. Although the availability is less than what is expected in f+1 Fault Tolerance systems for crash failures, this approach makes coordination and synchronization easy, reduces the effect of Byzantine faults and provides Fault Tolerance for distributed systems. The results obtained show that the availability changes with change in the number of Zookeeper servers. This model can help determine how many servers are optimal for high availability, from which vendor they must be purchased, and when to use a Zookeeper coordinated Hadoop cluster to perform safety critical tasks.

Full Text