Abstract

We present MPI-based parallel algorithms for counting triangles and computing clustering coefficients in massive networks. Counting triangles is important in the analysis of various networks, e.g., social, biological, web etc. Emerging massive networks do not fit in the main memory of a single machine and are very challenging to work with. Our distributed-memory parallel algorithm allows us to deal with such massive networks in a time- and space-efficient manner. We were able to count triangles in a graph with 2 billions of nodes and 50 billions of edges in 10 minutes. Our parallel algorithm for computing clustering coefficients uses efficient external memory aggregation. We also show how edge sparsification technique can be used with our parallel algorithm to find approximate number of triangles without sacrificing the accuracy of estimation. In addition, we propose a simple modification of a state-of-the-art sequential algorithm that improves both runtime and space requirement.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call