Abstract

As a powerful distributed data processing mechanism, MapReduce supports a wide range of parallel applications that process massive data on computer clusters. Processing such data efficiently and correctly requires a rational design of the MapReduce procedure. An irrational MapReduce procedure can waste considerable computing resources and even paralyze the execution system. As MapReduce is applied more widely, the problems caused by irrational MapReduce procedures become increasingly serious. To address this, a method is proposed for verifying the rationality of a MapReduce procedure before it is executed on a computer cluster. The method first constructs rationality criteria for MapReduce and then develops an automatic approach for modelling MapReduce with an executable model, the object Petri net (OPN). Finally, an approach is developed for automatically checking the rationality criteria by analyzing the results of model execution. The results of extensive case studies demonstrate that the proposed method is feasible and effective.

