Abstract

Recently Cloud based Hadoop has gained a lot of interest that offer ready to use Hadoop cluster environment for processing of Big Data, eliminating the operational challenges of on-site hardware investment, IT support, and installing, configuring of Hadoop components such as HDFS and MapReduce. On demand Hadoop as a service helps the industries to focus on business growth and based on pay per use model for Big Data processing with auto-scaling of Hadoop cluster feature. In this paper implementation of various MapReduce jobs like Pi, TeraSort, WordCount has been done on cloud based Hadoop deployment by using Microsoft Azure cloud services. Performance of MapReduce jobs has been evaluated with respect to CPU execution time with varying size of Hadoop cluster. From the experimental result, it is found that CPU execution time to finish the jobs decrease as the number of Data Nodes in HDInsight cluster increases and indicates the good response time with increase in performance as well as more customer satisfaction.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call