Abstract

Based on the MapReduce model and Hadoop Distributed File System (HDFS), Hadoop enables the distributed processing of large data sets across clusters with scalability and fault tolerance. Many data-intensive applications involve continuous and incremental updates of data. Understanding the scalability and cost of a Hadoop platform to handle small and independent updates of data sets sheds light on the design of scalable and cost-effective data-intensive applications. In this chapter, we introduce a motivating movie recommendation application implemented in the MapReduce model and deployed on Amazon Elastic MapReduce (EMR), a Hadoop [...]
