Abstract

Automated construction of weighted graphs from massive data is essential to data mining processes based on weighted graph theory. Computing the edge weights is time consuming, and on a single machine it can fail to complete when the necessary resources are exhausted. In addition, existing work lacks measurements of edge weight accuracy, which determines the accuracy of the graph and affects subsequent data mining results. This paper describes the classification, implementation, and evaluation of edge weight computation algorithms on the MapReduce framework, a powerful parallel and distributed processing model. First, we develop a classification of edge weight computation algorithms and discuss how they can be applied on MapReduce. Then we propose comprehensive measurements of edge weight accuracy in terms of the number of edges, strength distribution, community structure, hop-plot, and effective diameter. Finally, we conduct a performance study that evaluates these algorithms in terms of memory and disk usage, execution time, and accuracy, using a real massive dataset from a social network application. The results are presented and discussed. Our comparison can help identify the most effective parallel and distributed edge weight computation algorithm for constructing a weighted graph from a given massive dataset.
