Accumulative Computation on MapReduce

Yu Liu ,Kiminori Matsuzaki ,Kento Emoto ,Zhenjiang Hu

doi:10.11185/imt.9.73

Abstract

MapReduce programming model attracts a lot of enthusiasm among both industry and academia, largely because it simplifies the implementations of many data parallel applications. In spite of the simplicity of the program- ming model, there are many applications that are hard to be implemented by MapReduce, due to their innate characters of computational dependency. In this paper we propose a new approach of using the programming pattern accumulate over MapReduce, to handle a large class of problems that cannot be simply divided into independent sub-computations. Using this accumulate pattern, many problems that have computational dependency can be easily expressed, and then the programs will be transformed to MapReduce programs executed on large clusters. Users without much knowledge of MapReduce can also easily write programs in a sequential manner but finally obtain efficient and scalable MapRe- duce programs. We describe the programming interface of our accumulate framework and explain how to transform a user-specified accumulate computation to an efficient MapReduce program. Our experiments and evaluations illustrate the usefulness and efficiency of the framework.

Full Text