Abstract
Apache Hadoop is a software framework developed by the open source community. It supports storing and processing large-scale data sets on clusters of commodity hardware. HDFS (Hadoop Distributed File System) is the primary distributed storage used by Hadoop applications. An HDFS cluster consists mainly of a NameNode and DataNodes: the NameNode manages the file system metadata, and the DataNodes store the actual data. Hadoop is scalable, fault tolerant, and easy to expand. However, the NameNode frequently becomes a bottleneck, particularly when handling a huge number of small files. To maximize efficiency, the NameNode keeps the complete HDFS metadata in main memory, so with too many small files it can run out of memory. In this paper, we present a solution that uses multiple NameNodes. Our solution offers advantages over the existing approach: it implements load balancing, resolves the NameNode bottleneck problem, and reduces the average time required for reads and writes.