Abstract

The term “replication” refers to the practice of keeping identical copies of data on multiple systems, and replication is a core design requirement of any distributed file system. The Hadoop Distributed File System (HDFS) is now widely used across academia and industry for storing and processing huge volumes of data. In HDFS, inefficient replication is a main cause of performance degradation. Because of the system's robustness and dynamic features, the number of applications depending on Hadoop is growing rapidly. HDFS, which is at the heart of Apache Hadoop, uses a static replication strategy to provide computation with reliability, scalability, and high availability. However, the access rate of each file in HDFS is different, because different applications access files in different patterns. As a result, applying the same replication policy to every file can have negative performance consequences. After carefully considering these drawbacks of the HDFS architecture, this paper proposes an efficient data replication method for the Hadoop framework that dynamically replicates data files based on predictive analysis of their access rates, thereby increasing availability.
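To make the idea concrete, the sketch below shows how a per-file replication factor could be adjusted through Hadoop's standard FileSystem.setReplication API, driven by a predicted access rate. This is an illustrative sketch only, not the paper's concrete algorithm: the thresholds and the externally supplied access-rate prediction are hypothetical placeholders for whatever predictive model the system employs.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

/**
 * Illustrative sketch: adjust a file's HDFS replication factor from a
 * predicted access rate. The thresholds and prediction input below are
 * hypothetical placeholders, not the method proposed in the paper.
 */
public class DynamicReplicationSketch {

    // Hypothetical thresholds (accesses per hour) separating hot and cold files.
    private static final double HOT_THRESHOLD = 100.0;
    private static final double COLD_THRESHOLD = 5.0;

    /** Map a predicted access rate to a target replication factor. */
    static short chooseReplication(double predictedAccessesPerHour) {
        if (predictedAccessesPerHour >= HOT_THRESHOLD) {
            return 5;  // hot file: extra replicas for availability and read bandwidth
        } else if (predictedAccessesPerHour <= COLD_THRESHOLD) {
            return 2;  // cold file: fewer replicas to save storage
        }
        return 3;      // HDFS default replication factor
    }

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();  // picks up core-site.xml / hdfs-site.xml
        FileSystem fs = FileSystem.get(conf);

        Path file = new Path(args[0]);
        // Placeholder: a real system would derive this from the file's access history.
        double predicted = Double.parseDouble(args[1]);

        short target = chooseReplication(predicted);
        if (fs.getFileStatus(file).getReplication() != target) {
            // setReplication is the standard HDFS call for changing a file's replica count.
            fs.setReplication(file, target);
        }
        fs.close();
    }
}
```

The same per-file adjustment can be performed manually from the command line with `hdfs dfs -setrep`; the point of a dynamic scheme is to automate that decision from predicted access behavior rather than applying one static factor to every file.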
