Abstract

In big data environments, MapReduce can improve the efficiency of iterative algorithms on massive data by running them on larger PC clusters. However, re-iterating over the entire data set whenever new data is introduced is inefficient. This paper proposes an incremental iterative computing model (I2M) that works from the incremental data and the original iterative results. MapReduce- and I2M-based versions of descendant query, PageRank, and K-means are then described. Finally, an incremental iterative computing framework (I2F) is implemented by extending HaLoop to support incremental iterative computing. A series of test cases is designed to evaluate I2F on functionality, performance, and the cost of incremental iteration. The proposed model can be adapted to many iterative algorithms, and it promotes the application and optimization of iterative algorithms in big data environments.
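
The idea of reusing original iterative results when new data arrives can be illustrated with a small single-machine sketch. The code below is not the paper's I2M or I2F, and it does not use MapReduce or HaLoop; it is plain power-iteration PageRank in which the hypothetical `init` parameter lets the iteration start from ranks computed before the graph changed. The function name, parameters, and example graphs are all assumptions made for illustration.

```python
from typing import Dict, List, Optional, Tuple

Graph = Dict[str, List[str]]  # node -> list of nodes it links to


def pagerank(
    graph: Graph,
    init: Optional[Dict[str, float]] = None,
    damping: float = 0.85,
    tol: float = 1e-10,
    max_iter: int = 1000,
) -> Tuple[Dict[str, float], int]:
    """Power-iteration PageRank; returns (ranks, iterations used).

    `init` (hypothetical parameter) carries ranks from an earlier run so
    that the iteration can restart from them instead of a uniform vector.
    """
    nodes = sorted(set(graph) | {v for outs in graph.values() for v in outs})
    n = len(nodes)
    if init is None:
        ranks = {v: 1.0 / n for v in nodes}
    else:
        # Nodes unseen in the earlier run start at 1/n; then renormalise.
        ranks = {v: init.get(v, 1.0 / n) for v in nodes}
        total = sum(ranks.values())
        ranks = {v: r / total for v, r in ranks.items()}

    for iteration in range(1, max_iter + 1):
        # Rank held by nodes without out-links is spread over all nodes.
        dangling = sum(ranks[v] for v in nodes if not graph.get(v))
        new = {v: (1.0 - damping) / n + damping * dangling / n for v in nodes}
        for u in nodes:
            outs = graph.get(u, [])
            if outs:
                share = damping * ranks[u] / len(outs)
                for v in outs:
                    new[v] += share
        delta = sum(abs(new[v] - ranks[v]) for v in nodes)
        ranks = new
        if delta < tol:
            return ranks, iteration
    return ranks, max_iter


# Original graph and its converged ranks (the "original iterative results").
old_graph: Graph = {"a": ["b", "c"], "b": ["c"], "c": ["a"]}
old_ranks, _ = pagerank(old_graph)

# Incremental data: a new node "d" plus two new edges.
new_graph: Graph = {"a": ["b", "c"], "b": ["c"], "c": ["a", "d"], "d": ["a"]}

cold_ranks, cold_iters = pagerank(new_graph)                 # from scratch
warm_ranks, warm_iters = pagerank(new_graph, init=old_ranks)  # reuse old result
print(cold_iters, warm_iters)
```

Both calls converge to the same ranks within the tolerance; the two iteration counts printed at the end show how much work the reused starting point saves on this particular input. A full incremental model would also restrict recomputation to the affected part of the data, which this sketch does not attempt.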
