Abstract

is an explosion in the volume of data in the world. The amount of data is increasing by leaps and bounds. The sources are individuals, social media, organizations, etc. The data may be structured, semi-structured or unstructured. Gaining knowledge from this data and using it for competitive advantage is the primary focus of all the organizations. In the last few years Big Data has found its way in almost every field, from government to private sectors, industry to academia. The major challenges associated with Big Data are data organization, modeling, data analysis and retrieval. Hadoop is a widely used software framework used for the large scale management and analysis of data. The main components of Hadoop: HDFS and MapReduce, enable the distributed storage and processing of data over a large number of commodity servers. This paper provides an overview of MapReduce and its capabilities and discusses the related issues.

Full Text
Paper version not known

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call