Abstract
Replication is the practice of keeping identical copies of data on multiple systems, and it is a core design requirement of any distributed file system. The Hadoop Distributed File System (HDFS), the storage layer at the heart of Apache Hadoop, is now widely used in academia and industry to store and process very large volumes of data. In HDFS, inefficient replication is a major source of performance degradation. Because of the system's robustness and dynamic features, the number of applications that depend on Hadoop is growing rapidly. HDFS uses a static replication strategy to provide reliable, scalable, and highly available computation: every file receives the same fixed number of replicas. However, because different applications at different layers access data differently, the access rate of each file in HDFS is unique, so applying the same replication policy to every file can have negative performance consequences. After carefully considering these drawbacks of the HDFS architecture, this paper proposes a method for dynamically replicating data files based on predictive analysis. The proposed data replication method for the Hadoop framework addresses this issue and increases availability.
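The core idea of access-rate-driven replication can be illustrated with a minimal sketch. This is not the paper's actual algorithm; the function name, thresholds, and the linear scaling rule are all assumptions chosen for illustration, in place of HDFS's single static `dfs.replication` setting.

```python
# Illustrative sketch (assumed policy, not the paper's method):
# derive a per-file replication factor from its observed access count.
MIN_REPLICAS = 2   # assumed floor, kept for fault tolerance
MAX_REPLICAS = 6   # assumed ceiling, to bound storage cost

def dynamic_replication_factor(access_count, hot_threshold=100):
    """Map a file's recent access count to a replication factor.

    Frequently accessed ("hot") files get more replicas so reads can
    be served from more nodes; rarely accessed files keep the minimum.
    """
    if access_count <= 0:
        return MIN_REPLICAS
    # Scale linearly between the floor and the ceiling, saturating
    # once the access count reaches the hot threshold.
    scaled = MIN_REPLICAS + (MAX_REPLICAS - MIN_REPLICAS) * min(
        access_count, hot_threshold) / hot_threshold
    return round(scaled)

print(dynamic_replication_factor(5))    # cold file -> 2
print(dynamic_replication_factor(500))  # hot file  -> 6
```

A real implementation would feed predicted rather than observed access rates into such a policy and apply the result via the file system's replication API.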