Abstract

The Internet of Things (IoT) has shown great promise in recent years. Cloud computing can provide the infrastructure for storing and handling the potentially enormous volume of data generated by IoT devices. Consequently, the availability and reliability of cloud data will largely affect the success of IoT. Hadoop is a widely adopted platform in the cloud computing community, and the Hadoop Distributed File System (HDFS) is its default file system. HDFS keeps multiple copies of data files within a Hadoop cluster to prevent data loss. However, this approach still cannot guarantee the availability and reliability of data when a disaster, such as a fire or an earthquake, destroys the entire Hadoop cluster. As a result, maintaining data backups across different Hadoop clusters is essential for achieving high availability and reliability of cloud data. Currently, distcp is the only tool HDFS provides to duplicate data files among Hadoop clusters deployed at different locations. Unfortunately, users must execute distcp manually, which cannot guarantee timely synchronization of the duplicated data files among Hadoop clusters. Moreover, distcp always transfers the entire contents of data files between Hadoop clusters, no matter how little new data has actually been written, which can waste considerable time and network bandwidth in practice. We designed and implemented an efficient scheme in HDFS, named syncopy (synchronous copy), that automatically performs real-time synchronization of data files duplicated among different Hadoop clusters. Our experimental results show that syncopy can reduce the required time by up to 99.20% compared with distcp.
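For context, the baseline approach criticized above amounts to a manual, whole-file copy between clusters (e.g., running "hadoop distcp hdfs://cluster-a:8020/data hdfs://cluster-b:8020/backup" by hand). The following minimal Java sketch, using the standard Hadoop FileSystem API, illustrates that whole-file transfer; the cluster addresses and paths are hypothetical, and this is not the paper's syncopy implementation.

    import java.net.URI;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.FileUtil;
    import org.apache.hadoop.fs.Path;

    public class WholeFileBackup {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            // Hypothetical NameNode addresses for the source and backup clusters.
            FileSystem srcFs = FileSystem.get(URI.create("hdfs://cluster-a:8020"), conf);
            FileSystem dstFs = FileSystem.get(URI.create("hdfs://cluster-b:8020"), conf);
            Path src = new Path("/data/events.log");
            Path dst = new Path("/backup/events.log");
            // The entire file is transferred even if only a few bytes changed,
            // which is the inefficiency the abstract attributes to distcp.
            FileUtil.copy(srcFs, src, dstFs, dst, false /* deleteSource */, conf);
        }
    }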
