Abstract

Graph algorithms are used to implement data mining tasks on graph data sets. Besides running algorithms in the default deterministic manner, some graph processing frameworks, especially those that support an asynchronous execution model, provide interfaces for executing algorithms nondeterministically, which can improve the scalability and performance of their execution. However, is a given graph algorithm eligible for nondeterministic execution, and will that execution produce the expected results? The literature gives few answers to these questions. In this paper, we study the nondeterministic execution of graph algorithms by considering the scenario in which data dependences arise on edges, in graph processing frameworks that employ an asynchronous execution model. Our study reveals that, when only the atomicity of individual reads and writes is guaranteed, some algorithms (e.g., graph traversal algorithms) can converge under nondeterministic execution by recovering from corrupted intermediate results, and thus tolerate even write-write conflicts, while other algorithms (e.g., fixed-point iteration algorithms) can converge but tolerate only read-write conflicts. By running graph algorithms on real-world graphs in GraphChi and comparing their performance and results with those of deterministic executions, we find that the performance gains of nondeterministic execution generally scale with the number of available processors, and that the results at convergence of fixed-point iteration algorithms vary more from one run to another under nondeterministic execution than under deterministic execution.
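To make the claim about graph traversal algorithms concrete, the following is a minimal Python sketch, not the paper's implementation and not the GraphChi API. It assumes a small hypothetical undirected graph and simulates nondeterminism only by shuffling the order in which vertices are scheduled in a single thread; it does not model real data races. The function names `bfs_levels` and `async_levels` are illustrative. Because each update only ever lowers a distance toward its true value, any schedule should reach the same fixed point as the deterministic traversal.

```python
import random
from collections import deque

INF = float("inf")

def bfs_levels(adj, source):
    """Deterministic reference: standard queue-based BFS levels."""
    dist = {v: INF for v in adj}
    dist[source] = 0
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if dist[v] == INF:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def async_levels(adj, source, seed):
    """Vertex-centric min-relaxation under a randomly shuffled schedule.

    The shuffle is a stand-in (an assumption of this sketch) for the
    arbitrary update order of a nondeterministic executor.
    """
    rng = random.Random(seed)
    dist = {v: INF for v in adj}
    dist[source] = 0
    vertices = list(adj)
    changed = True
    while changed:
        changed = False
        rng.shuffle(vertices)
        for u in vertices:
            for v in adj[u]:
                # Monotone update: a distance is only ever lowered.
                if dist[u] + 1 < dist[v]:
                    dist[v] = dist[u] + 1
                    changed = True
    return dist

# Hypothetical example graph (adjacency lists, undirected).
adj = {0: [1, 2], 1: [0, 3], 2: [0, 3], 3: [1, 2, 4], 4: [3]}
reference = bfs_levels(adj, 0)
results = [async_levels(adj, 0, seed) for seed in range(20)]
print(all(r == reference for r in results))
```

A fixed-point iteration algorithm with floating-point accumulation would not offer the same guarantee in such a sketch, since the order of updates can change the values reached at convergence.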
