Abstract

Many specialized graph processing systems have been developed to meet the growing demand for graph analytics. Most of them require users to deploy a new framework in the cluster for graph processing and to switch to other systems to execute non-graph algorithms. This increases the complexity of cluster management and results in unnecessary data movement and duplication. In this paper, we propose a graph processing engine named epiCG, which is built on top of epiC, an elastic data processing system. The core of epiCG is a new unit called GraphUnit, which can not only perform iterative graph processing efficiently, but also collaborate with other types of units to accomplish complex, multi-stage data analytics. epiCG supports both edge-cut and vertex-cut partitioning methods, and for the latter we propose a novel light-weight greedy strategy that enables all the GraphUnits to generate a vertex-cut partitioning in parallel. Furthermore, unlike existing graph processing systems, failure recovery in epiCG is completely automatic. We compare epiCG with several prevalent graph processing systems via extensive experiments with real-life datasets and applications. The results show that epiCG is efficient and scalable, and that it performs especially well on large datasets, which makes it well suited to large-scale graph processing.
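The abstract does not spell out epiCG's greedy strategy, so the following is only a rough illustration of what a vertex-cut partitioning is: edges are assigned to partitions, and a vertex is replicated on every partition that holds one of its edges. The sketch uses a simplified, sequential greedy rule of the kind found in other vertex-cut systems; it is not the parallel strategy proposed in the paper, and the names `greedy_vertex_cut` and `replication_factor` are hypothetical.

```python
from collections import defaultdict


def greedy_vertex_cut(edges, num_parts):
    """Assign each edge to one partition (illustrative heuristic, not epiCG's)."""
    replicas = defaultdict(set)  # vertex -> partitions that hold a copy of it
    load = [0] * num_parts       # number of edges placed on each partition
    assignment = {}              # edge -> chosen partition

    for u, v in edges:
        shared = replicas[u] & replicas[v]
        either = replicas[u] | replicas[v]
        # Prefer a partition that already holds both endpoints, then one that
        # holds either endpoint, and only then fall back to any partition.
        candidates = shared or either or set(range(num_parts))
        # Break ties by lowest load, then by lowest partition id.
        p = min(candidates, key=lambda i: (load[i], i))
        assignment[(u, v)] = p
        load[p] += 1
        replicas[u].add(p)
        replicas[v].add(p)
    return assignment, dict(replicas)


def replication_factor(replicas):
    """Average number of partitions on which a vertex is replicated."""
    return sum(len(parts) for parts in replicas.values()) / len(replicas)


if __name__ == "__main__":
    edges = [(1, 2), (2, 3), (3, 1), (4, 5)]
    assignment, replicas = greedy_vertex_cut(edges, num_parts=2)
    print(assignment)
    print(replication_factor(replicas))
```

A lower replication factor means fewer vertex copies to keep synchronized across partitions, which is the quantity greedy vertex-cut heuristics generally try to keep small while balancing edge load.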
