Abstract

MapReduce is a parallel programming model and an associated implementation introduced by Google. In the programming model, a user specifies the computation by two functions, Map and Reduce. The underlying MapReduce library automatically parallelizes the computation, and handles complicated issues like data distribution, load balancing and fault tolerance. Massive input, spread across many machines, need to parallelize. Moves the data, and provides scheduling, fault tolerance. The original MapReduce implementation by Google, as well as its open-source counterpart, Hadoop, is aimed for parallelizing computing in large clusters of commodity machines. Map Reduce has gained a great popularity as it gracefully and automatically achieves fault tolerance. It automatically handles the gathering of results across the multiple nodes and returns a single result or set. This paper gives an overview of MapReduce programming model and its applications. The author has described here the workflow of MapReduce process. Some important issues, like fault tolerance, are studied in more detail. Even the illustration of working of Map Reduce is given. The data locality issue in heterogeneous environments can noticeably reduce the Map Reduce performance. In this paper, the author has addressed the illustration of data across nodes in a way that each node has a balanced data processing load stored in a parallel manner. Given a data intensive application running on a Hadoop Map Reduce cluster, the auhor has exemplified how data placement is done in Hadoop architecture and the role of Map Reduce in the Hadoop Architecture. The amount of data stored in each node to achieve improved data-processing performance is explained here.

Full Text
Paper version not known

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.