Abstract

Hadoop is the industry's de facto tool for Big Data computation. However, Hadoop's native fault tolerance procedure is slow and degrades performance; moreover, it fails to fully account for computational overhead and storage cost. The dynamic nature and complexity of MapReduce are further parameters that affect job response time. Addressing these concerns requires a robust failure-handling technique. In this paper, we analyze notable fault tolerance techniques to assess their impact on different performance metrics under variable datasets and variable fault injections. The key results show that, in terms of response time, the byzantine fault tolerance technique outperforms the retrying and checkpointing techniques when a single node is killed. In terms of throughput, the task-level byzantine fault tolerance technique again ranks highest compared with checkpointing and retrying under network-disconnect failures. Overall, this comparative study highlights the strengths and weaknesses of the different fault-tolerant techniques and helps determine the most suitable technique for a given environment.
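For context on the retrying technique evaluated above, Hadoop's native retry behaviour is normally tuned through job configuration. The sketch below is a minimal illustration, not taken from the paper: it assumes a standard Hadoop 2.x MapReduce job, and the retry counts, class name, and job name are illustrative assumptions.

// Minimal sketch (illustrative only): configuring Hadoop's native retrying
// fault tolerance when submitting a MapReduce job. The property keys are
// standard Hadoop 2.x names; the values chosen here are assumptions.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class RetryConfiguredJob {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Native retrying: a failed task attempt is re-launched up to this
        // many times before the whole job is declared failed.
        conf.setInt("mapreduce.map.maxattempts", 4);
        conf.setInt("mapreduce.reduce.maxattempts", 4);

        Job job = Job.getInstance(conf, "fault-tolerance-benchmark"); // name is illustrative
        job.setJarByClass(RetryConfiguredJob.class);
        // The Mapper/Reducer classes of the benchmark workload would be set
        // here; left unset, Hadoop runs identity map/reduce tasks.
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}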
