GraphA: Adaptive Partitioning for Natural Graphs

Dongsheng Li,Yiming Zhang,Zhaoning Zhang,Jinyan Wang,Chengfei Zhang

doi:10.1109/icdcs.2017.37

Abstract

Large-scale graph computation is central to applications ranging from language processing to social networks. However, natural graphs tend to have skewed power-law distributions where a small subset of the vertices have a large number of neighbors. Existing graph-parallel systems suffer from load imbalance, high communication cost, or suboptimal and complex processing. In this paper we present GraphA, an Adaptive approach to efficient partitioning and computation of large-scale natural graphs. GraphA provides an adaptive and uniform graph partitioning algorithm, which partitions the datasets in a load-balanced manner by using an incremental number of hash functions. We have implemented GraphA both on Spark and on GraphLab. Extensive evaluation shows that GraphA remarkably outperforms state-of-the-art graph-parallel systems (GraphX and PowerLyra) in ingress time, execution time and storage overhead, for both real-worldand synthetic graphs.

Full Text