Abstract

MapReduce, pioneered by Google, is widely employed in Big Data analytics, and most algorithms for mining data in a MapReduce environment are re-used from existing implementations. Predicting the execution time and system overhead of a MapReduce job is therefore vital, since the job's performance can be ascertained from these predictions. Cloud computing is widely used as a computing platform in both business and academic communities, and performance plays a major role when a user runs an application in the cloud: the user may want to estimate the application's execution time (latency) before submitting a task or job. For this experiment, Hadoop clusters are deployed in a cloud environment, and system overhead is determined by running MapReduce jobs over these clusters. While performing the experiment, metrics such as network I/O, CPU and swap utilization, job completion time, RSS, and VSZ were captured and evaluated in order to diagnose how the performance of Hadoop is influenced by reconfiguring the block size and the input split size relative to the block size.
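The block-size/split-size interplay mentioned above can be sketched numerically. Hadoop's `FileInputFormat` derives the split size as `max(minSize, min(maxSize, blockSize))`, where the block size comes from `dfs.blocksize` and the bounds from `mapreduce.input.fileinputformat.split.minsize`/`.maxsize`; the number of splits then determines the number of map tasks. The following is a minimal illustrative sketch of that formula (the helper names are ours, not the paper's code):

```python
# Sketch of FileInputFormat.computeSplitSize: max(min, min(max, block)).
def compute_split_size(block_size, min_size=1, max_size=2**63 - 1):
    return max(min_size, min(max_size, block_size))

# Number of map tasks a file of file_size bytes produces (ceiling division).
def num_splits(file_size, split_size):
    return -(-file_size // split_size)

MB = 1024 * 1024
# With default bounds, the split size equals the block size (e.g. 128 MB),
# so a 1 GB input file yields 8 splits, hence 8 map tasks.
default_split = compute_split_size(block_size=128 * MB)
print(num_splits(1024 * MB, default_split))
# Capping the maximum split at 64 MB halves the split size and
# doubles the map-task count to 16, changing job latency and overhead.
small_split = compute_split_size(block_size=128 * MB, max_size=64 * MB)
print(num_splits(1024 * MB, small_split))
```

This is why reconfiguring the block size (or the split bounds relative to it) directly changes map-task parallelism, and with it the job's completion time and per-node resource footprint.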

