Abstract

HDFS (Hadoop Distributed File System) is designed to store very large datasets reliably and to stream them to applications at high bandwidth, achieving fault tolerance through data replication. Much research has addressed placing data at the most suitable location, yet the storage space consumed by replicas remains a difficult problem that degrades file-system performance. To overcome this issue, the proposed system performs data replication based on access count estimation within the Hadoop framework: it creates replicas according to how frequently data is accessed, mitigates the data locality problem through improved placement of those replicas, and assigns MapReduce tasks to efficient workers to obtain better results. An experiment using a benchmark compared the proposed technique against the default HDFS replication policy and previously published replication techniques; the results show that the proposed method achieves higher throughput than the earlier approaches.
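
The abstract does not give the paper's estimation formula, so the following is only a minimal sketch of the general idea of access-count-driven replication, written against the standard Hadoop FileSystem API. The class name AccessCountReplicator, the thresholds, and the linear scaling rule (one extra replica per fixed number of accesses) are all illustrative assumptions, not the authors' method; only FileSystem.setReplication is the real HDFS call for changing a file's replica count.

```java
import java.io.IOException;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

/**
 * Illustrative sketch: raise the replication factor of frequently
 * accessed files. Bounds and scaling step are assumed values.
 */
public class AccessCountReplicator {
    private static final short MIN_REPLICAS = 2;          // assumed lower bound
    private static final short MAX_REPLICAS = 6;          // assumed upper bound
    private static final long ACCESSES_PER_REPLICA = 100; // assumed scaling step

    private final FileSystem fs;
    private final Map<String, Long> accessCounts = new ConcurrentHashMap<>();

    public AccessCountReplicator(Configuration conf) throws IOException {
        this.fs = FileSystem.get(conf);
    }

    /** Record one read of the given file (hooked into the client layer). */
    public void recordAccess(Path file) {
        accessCounts.merge(file.toString(), 1L, Long::sum);
    }

    /** Map an access count to a replication factor, clamped to the bounds. */
    short targetReplication(long accesses) {
        long factor = MIN_REPLICAS + accesses / ACCESSES_PER_REPLICA;
        return (short) Math.min(MAX_REPLICAS, factor);
    }

    /** Periodically re-evaluate hot files and ask HDFS to re-replicate them. */
    public void rebalance() throws IOException {
        for (Map.Entry<String, Long> e : accessCounts.entrySet()) {
            Path file = new Path(e.getKey());
            short target = targetReplication(e.getValue());
            if (fs.getFileStatus(file).getReplication() != target) {
                // setReplication() asks the NameNode to add or remove
                // replicas of an existing file.
                fs.setReplication(file, target);
            }
        }
    }
}
```

In a scheme like this, the NameNode handles the actual block copying once setReplication is called, so the estimator only has to decide the target replica count; where those extra replicas are placed (the data locality aspect the paper addresses) would require a custom block placement policy, which is not shown here.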
