Abstract

Data exploration and utilization based on big data analytics hold immense promise for the future of businesses. However, as the name suggests, processing such huge amounts of data is challenging. Hadoop, with its parallel processing solutions, helps process big data in reasonable time. The heart of Hadoop is the Hadoop Distributed File System (HDFS), and how data is placed in the file system dictates the speed of data processing. Hence, over the years, efficient data placement algorithms have been one of the key research areas in big data analytics. Evaluating such algorithms traditionally requires deploying HDFS on a hardware cluster and implementing the data placement algorithm on it. It is often difficult for researchers to acquire the required hardware and build such clusters. Even when clusters are available, scalability becomes an issue. Moreover, a realistic, data-center-like cluster is not available to many researchers. Simulation provides a low-cost alternative for evaluating big data placement algorithms on HDFS. Among the key metrics that data placement algorithms optimize are communication cost and latency. Thus, a framework built on network simulation is well suited to the role. NS3 is one of the most prominent network simulation tools available to researchers. However, it does not yet implement full HDFS support for data placement research. This work proposes to extend the NS3 simulation environment to support HDFS, with the eventual goal of using it to evaluate data placement algorithms.
