Handling Pregel’s Limits in Big Graph Processing in the Presence of High-Degree Vertices

Mohamad Al Hajj Hassan, Mostafa Bamha

doi:10.1007/978-3-319-76472-6_8

Abstract

Specialized distributed graph processing systems such as Pregel scale better than pure MapReduce programs for graph processing: they reduce disk I/O for iterative algorithms and offer an easy programming model based on the “think like a vertex” paradigm. Large-scale graph processing nevertheless remains challenging in the presence of high-degree vertices, where communication and load imbalance among processing nodes can have disastrous effects on performance. In this article, we introduce a scalable MapReduce graph partitioning approach for high-degree vertices using master/slave partitioning. This partitioning makes Pregel-like systems scalable and insensitive to the effects of high-degree vertices while guaranteeing perfect balancing of communication and computation during all stages of big graph processing. A cost model and a performance analysis are given to show the effectiveness and scalability of our graph partitioning approach in large-scale systems.

Full Text