Abstract

Hadoop is a distributed batch-processing infrastructure widely used for big data management. Its foundation is the Hadoop Distributed File System (HDFS). HDFS follows a master/slave architecture comprising a single Name Node and many Data Nodes: the Name Node stores the file system metadata, while the Data Nodes store the application data. Because the Name Node holds all of this metadata in memory, the number of files the file system can hold is governed by the amount of memory on the Name Node; once that memory is exhausted, there is no further way to increase cluster capacity. In this paper we apply the concept of cache memory to address the issue of Name Node scalability. The focus of this paper is to present our approach, which enhances the current architecture and ensures that the Name Node does not reach its memory threshold as quickly.
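As context for the memory bound, Hadoop documentation commonly cites a rule of thumb of roughly 150 bytes of Name Node heap per file-system object (file, directory, or block), which is what ties cluster capacity to Name Node RAM. The abstract does not specify the caching policy used; as a minimal sketch, assuming an LRU eviction scheme in which cold file metadata is spilled from Name Node memory to secondary storage, the idea might look like the following (the class `MetadataCache` and the helper `spillToDisk` are hypothetical names, not from the paper):

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch, not the paper's implementation: keep only the
// most recently used file metadata in Name Node RAM and spill cold
// entries to secondary storage on eviction.
public class MetadataCache extends LinkedHashMap<String, String> {

    private final int capacity; // max entries kept in memory

    public MetadataCache(int capacity) {
        // accessOrder = true gives least-recently-used iteration order
        super(16, 0.75f, true);
        this.capacity = capacity;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<String, String> eldest) {
        if (size() > capacity) {
            spillToDisk(eldest.getKey(), eldest.getValue());
            return true; // evict the coldest entry from memory
        }
        return false;
    }

    // Placeholder (assumed helper): persist an evicted record so it can
    // be reloaded on a later cache miss instead of occupying RAM.
    private void spillToDisk(String path, String metadata) {
        System.out.println("spilled " + path + " -> " + metadata);
    }

    public static void main(String[] args) {
        MetadataCache cache = new MetadataCache(2);
        cache.put("/data/a.txt", "blocks=[b1,b2]");
        cache.put("/data/b.txt", "blocks=[b3]");
        cache.put("/data/c.txt", "blocks=[b4]"); // spills /data/a.txt
        System.out.println(cache.keySet());      // [/data/b.txt, /data/c.txt]
    }
}
```

In such a design, the choice of which metadata to keep hot and where evicted entries are spilled would determine how far the Name Node's memory ceiling is pushed back.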
