Abstract

Fault tolerance is of great importance for big data systems. Although several software-based, application-level techniques exist for fault security in big data systems, there is potential research space at the hardware level. Big data needs to be processed inexpensively and efficiently, and traditional hardware architectures, although adequate, are not optimal for this purpose. In this paper, we propose a hardware-level fault tolerance scheme for big data and cloud computing that can be used alongside existing software-level fault tolerance to improve the overall performance of the systems. The proposed scheme uses the concurrent error detection (CED) method to detect hardware-level faults with the help of Scalable Error Detecting Codes (SEDC) and their checker. SEDC is an all unidirectional error detection (AUED) technique capable of detecting multiple unidirectional errors. The SEDC scheme exploits data segmentation and parallel encoding when assigning code words. Consequently, it can be scaled to any binary data length “n” with constant latency and lower complexity than other AUED schemes, which makes it well suited for use in big data processing hardware. We also present a novel area-, delay-, and power-efficient, scalable fault secure checker design based on SEDC. To show the effectiveness of our scheme, we (1) compared the cost of hardware-based fault tolerance with an existing software-based fault tolerance technique used in HDFS and (2) compared the performance of the proposed checker in terms of area, speed, and power dissipation with the well-known Berger code and m-out-of-2m code checkers.
The experimental results show that (1) the proposed SEDC-based hardware-level fault tolerance scheme significantly reduces the average cost associated with software-based fault tolerance in a big data application, and (2) the proposed fault secure checker outperforms the state-of-the-art checkers in terms of area, delay, and power dissipation.
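
To make the notion of all unidirectional error detection concrete, the following is a minimal Python sketch of the classical Berger code, one of the baseline AUED codes named above. It is not the SEDC encoding, which the abstract does not specify. The function names (`berger_check`, `encode`, `is_valid`, `force_bits`, `all_unidirectional_detected`) and the bit-string representation are illustrative assumptions. The check symbol is the binary count of 0s in the information word, so any set of 1-to-0 flips raises the zero count while it can only lower the check value, and any set of 0-to-1 flips does the opposite.

```python
from itertools import combinations, product


def berger_check(info: str) -> str:
    """Berger check symbol: the number of 0s in `info`, written in binary."""
    width = len(info).bit_length()  # enough bits to hold any count 0..len(info)
    return format(info.count("0"), f"0{width}b")


def encode(info: str) -> str:
    """Append the check symbol to the information bits."""
    return info + berger_check(info)


def is_valid(word: str, k: int) -> bool:
    """Checker: recompute the check symbol from the first k bits and compare."""
    return berger_check(word[:k]) == word[k:]


def force_bits(word: str, positions, value: str) -> str:
    """Model a unidirectional error: force the chosen positions to `value`."""
    bits = list(word)
    for p in positions:
        bits[p] = value
    return "".join(bits)


def all_unidirectional_detected(k: int) -> bool:
    """Exhaustively confirm that every unidirectional error is flagged for k info bits."""
    for info in product("01", repeat=k):
        word = encode("".join(info))
        n = len(word)
        for value in "01":  # "0" models 1->0 errors, "1" models 0->1 errors
            for r in range(1, n + 1):
                for positions in combinations(range(n), r):
                    corrupted = force_bits(word, positions, value)
                    if corrupted != word and is_valid(corrupted, k):
                        return False
    return True


if __name__ == "__main__":
    print(encode("1011"))                   # 1011001
    print(all_unidirectional_detected(4))   # True
```

A bidirectional error can still escape this check: changing `1011001` to `0111001` flips one bit in each direction, leaves the zero count unchanged, and is accepted as valid. This is why such codes are described as detecting unidirectional errors only.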

Highlights

  • Big data is promising for business applications and is rapidly increasing as an important segment of the IT industry

  • When a fault occurs, the damage should stay within an acceptable threshold rather than forcing the whole task to restart from scratch, which makes fault tolerance an integral part of cloud computing and big data [3]

  • To demonstrate the advantage of the presented fault secure (FS) Scalable Error Detecting Codes (SEDC) checker over state-of-the-art all unidirectional error detection (AUED) checkers, we show that the FS SEDC checker achieves state-of-the-art performance in terms of area, delay, and power dissipation

Summary

Introduction

Big data is promising for business applications and is rapidly growing into an important segment of the IT industry. Given the importance of fault tolerance at the HW level in big data and cloud computing applications, in this paper, we present a fault secure (FS) SEDC checker used with SEDC codes [25]. The FS SEDC checker inherits the features of SEDC codes (i.e., simple scalability, constant latency, and low power dissipation), which suit its implementation in online fault detection in processors, cache memories, and NAND Flash-based memories for big data. As a first contribution, we propose HW-level fault tolerance for circuits designed to process big data and cloud computing applications.

Introduction to the Overall System
The FS SEDC Checker
Experiments and Results
Cost Analysis
Conclusions and Future Work