Abstract

Distributed graph computing extracts knowledge by performing computations on large graphs. When the input arrives continuously as a stream, the system is called streaming graph computing. A basic and significant step in computing on large graphs is distributing the graph over a cluster of nodes, which is called partitioning. If the graph is not partitioned properly, communication quickly becomes a limiting factor in scaling the system, especially in streaming graph computing. Moreover, in some clusters the CPU speed and memory size differ from node to node. Observing that in such heterogeneous clusters the nodes with fewer resources limit the overall computing speed, we ask whether the partitioning algorithm can be improved. We propose a simple heuristic for partitioning in such clusters and compare its performance with that of some classic algorithms. The heuristic reduces communication cost and makes better use of the nodes that have more resources. Finally, we evaluate the performance gains in imbalanced clusters by using our graph partitioning method to run a standard PageRank computation on a large real-world World Wide Web link graph. The results show that, under these circumstances, our heuristic is a significant improvement.
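The abstract does not spell out the proposed heuristic, so the following is only an illustrative sketch of the general idea of capacity-aware streaming partitioning, not the authors' method. It adapts a well-known greedy streaming rule (place each arriving vertex where most of its already-placed neighbors live, discounted by how full that partition is) so that each cluster node has its own capacity. The function name `weighted_greedy_partition`, the `capacities` parameter, and the tie-breaking rule are all assumptions made for illustration.

```python
def weighted_greedy_partition(vertex_stream, capacities):
    """Assign streamed vertices to partitions with unequal capacities.

    vertex_stream: iterable of (vertex, neighbors) pairs, seen once, in order.
    capacities:    hypothetical per-node limits (max vertices per partition),
                   standing in for differing CPU/memory resources.
    Returns (assignment dict, per-partition loads).
    """
    assignment = {}
    loads = [0] * len(capacities)

    for vertex, neighbors in vertex_stream:
        best, best_score = None, None
        for p, cap in enumerate(capacities):
            if loads[p] >= cap:
                continue  # this node is full
            # Neighbors already placed on partition p (fewer cut edges if we join them).
            placed = sum(1 for n in neighbors if assignment.get(n) == p)
            # Fraction of this node's own capacity that is still free.
            free = 1.0 - loads[p] / cap
            # Primary score: neighbor count discounted by fullness.
            # Tie-break (an assumption): prefer the relatively emptier node.
            score = (placed * free, free)
            if best_score is None or score > best_score:
                best, best_score = p, score
        if best is None:
            raise ValueError("total capacity exhausted")
        assignment[vertex] = best
        loads[best] += 1

    return assignment, loads
```

Because the fullness discount is relative to each node's own capacity, a node with more resources keeps attracting vertices for longer than a small node, which is one simple way to avoid letting the weakest node become the bottleneck.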

