Abstract

The big graph era is arriving, with strong and ever-growing demand for parallel iterative analysis. Before such analysis can run, however, the graph must be partitioned, and balanced graph partitioning is a fundamental problem that is NP-complete. Several streaming heuristics have been proposed that need only a single full scan over the input graph. However, some of them rely on complicated heuristics and cannot easily be parallelized to further accelerate the partitioning of large-scale graphs, while others can run in parallel but incur expensive communication costs during iterative computation. This paper presents Target-vertex Sensitive Hash (TSH), a partitioning method that is easy to distribute. We first analyze the locality property naturally provided by the original input graph, which existing work has not yet considered. We then exploit this locality to simplify the heuristic rule. The simplified rule is implemented in a two-step framework: the target vertices of edges are first logically pre-divided without accessing any graph data, and then, based on the distribution of target vertices, streaming partitioning is physically performed in parallel. TSH can divide large-scale graphs quickly thanks to parallelization, and it reduces communication overhead by exploiting locality. Using a broad spectrum of real-world graphs, we conduct extensive performance studies that confirm the effectiveness of TSH compared with up-to-date competitors.
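
The two-step idea can be illustrated with a minimal sketch. This is not the TSH algorithm itself, whose heuristic rule the abstract does not spell out. It assumes, purely for illustration, that vertices carry integer IDs 0..n-1, that the logical pre-division is a split of the ID space into k contiguous ranges, and that each edge is placed in the partition that owns its target vertex. The names `pre_divide`, `stream_partition`, and `merge` are hypothetical.

```python
from collections import defaultdict


def pre_divide(num_vertices, k):
    """Step 1 (logical): split the ID space 0..num_vertices-1 into k
    contiguous ranges. No edge data is read; only n and k are needed.
    ASSUMPTION: range-based division stands in for the paper's rule."""
    size = max(1, -(-num_vertices // k))  # ceiling division, at least 1

    def owner(vertex):
        # Partition index that owns this vertex ID.
        return min(vertex // size, k - 1)

    return owner


def stream_partition(edges, owner):
    """Step 2 (physical): one pass over an edge stream, placing each
    edge in the partition that owns its target vertex."""
    parts = defaultdict(list)
    for src, dst in edges:
        parts[owner(dst)].append((src, dst))
    return parts


def merge(results):
    """Combine per-worker outputs. Because `owner` is a pure function,
    workers can process disjoint chunks of the stream independently."""
    merged = defaultdict(list)
    for parts in results:
        for pid, chunk in parts.items():
            merged[pid].extend(chunk)
    return merged


if __name__ == "__main__":
    edges = [(0, 1), (1, 2), (2, 3), (3, 0), (4, 5), (5, 4)]
    owner = pre_divide(num_vertices=6, k=2)
    # Two "workers", each scanning half of the edge stream.
    halves = [edges[:3], edges[3:]]
    merged = merge(stream_partition(half, owner) for half in halves)
    for pid in sorted(merged):
        print(pid, merged[pid])
```

Because the ownership function depends only on the vertex ID and the number of partitions, every worker computes the same placement without exchanging state, which is the property that makes this kind of scheme straightforward to run in parallel.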
