Abstract

Specialized distributed graph processing systems such as Pregel scale better than pure MapReduce programs: they reduce disk I/O for iterative algorithms and offer an easy programming model based on the “think like a vertex” paradigm. Large-scale graph processing nevertheless remains challenging in the presence of high-degree vertices, where communication and load imbalance among processing nodes can have disastrous effects on performance. In this article, we introduce a scalable MapReduce graph partitioning approach for high-degree vertices using master/slave partitioning. This partitioning makes Pregel-like systems scalable and insensitive to the effects of high-degree vertices while guaranteeing perfect balancing of communication and computation during all stages of big graph processing. A cost model and a performance analysis are given to show the effectiveness and scalability of our graph partitioning approach in large-scale systems.
