RBSEP: a reassignment and buffer based streaming edge partitioning approach

Monireh Taimouri, Hamid Saadatfar

doi:10.1186/s40537-019-0257-5

Open Access

https://doi.org/10.1186/s40537-019-0257-5

Journal: Journal of Big Data
Publication Date: Oct 19, 2019
Citations: 4
License type: open-access

Affiliation: University of Birjand

Abstract

In recent years, the rapid growth of the Internet has led to the creation of massive graphs. Because these datasets have become so large, a single machine can no longer process them in an acceptable time; traditional graph partitioning methods, which usually require a complete view of the entire graph, are therefore not applicable to large datasets. This challenge has led to a new approach called streaming graph partitioning. In streaming graph partitioning, a partitioner receives a stream of input data and decides which computational machine each item should be sent to. A streaming partitioner often has no information about the whole graph and usually distributes the incoming data using greedy heuristics, which may not be optimal. Hence, the partitioner's decisions can be significantly improved if more information about the graph is used. In this paper, we present a new vertex-cut streaming graph partitioning approach. The proposed method postpones the decision for some edges (by means of intelligent buffering) and corrects some past decisions to improve the quality of the graph partitioning. The proposed approach is evaluated using real-world graphs. The experimental results show that the proposed method outperforms the earlier HDRF method.

Highlights

In recent years, the rapid development of the Internet has led to the emergence of large graphs
We propose Reassignment and Buffer based Streaming Edge Partitioning (RBSEP), which produces balanced partitions and improves partitioning quality in terms of vertex-cut
A partitioner such as High Degree Replicated First (HDRF) assigns an edge whose endpoints have not yet been copied to any partition to the partition with the fewest edges, based only on its balance criterion
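The balance-only fallback mentioned in the last highlight can be illustrated with a minimal sketch. This is a simplified greedy vertex-cut rule written for illustration, not HDRF's actual scoring function and not RBSEP; the function name `greedy_assign` and the tie-breaking by partition index are assumptions.

```python
from collections import defaultdict


def greedy_assign(edges, k):
    """Assign each streamed edge to one of k partitions (simplified greedy rule).

    Hypothetical helper for illustration only; it omits HDRF's degree-based
    scoring and any buffering or reassignment.
    """
    load = [0] * k                 # number of edges stored in each partition
    replicas = defaultdict(set)    # vertex -> partitions that hold a copy of it
    assignment = []
    for u, v in edges:
        shared = replicas[u] & replicas[v]
        either = replicas[u] | replicas[v]
        # Prefer partitions that already hold both endpoints, then either one.
        # If neither endpoint has been placed yet, every partition is a
        # candidate and the choice rests on balance alone.
        candidates = shared or either or range(k)
        p = min(candidates, key=lambda q: (load[q], q))
        load[p] += 1
        replicas[u].add(p)
        replicas[v].add(p)
        assignment.append(p)
    return assignment, replicas, load


if __name__ == "__main__":
    stream = [(1, 2), (3, 4), (1, 3), (2, 4)]
    assignment, replicas, load = greedy_assign(stream, k=2)
    print(assignment, load)
```

In this sketch the first two edges share no placed endpoints, so each goes to the currently least-loaded partition. That early, balance-only choice is made without any knowledge of edges that arrive later.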

Summary

Introduction

In recent years, the rapid development of the Internet has led to the emergence of large graphs. Real-world graphs follow a power-law distribution, with few high-degree vertices and many low-degree vertices. It has been shown that edge partitioning can be more efficient for partitioning power-law graphs [7, 8].

Problem statement

Natural graphs have a prominent property: a skewed power-law degree distribution. Most vertices have relatively few neighbors, while a few vertices have many neighbors, and the probability that a vertex has degree d is

$P(d) \propto d^{-\alpha}$ (1)

To formally define the k-way vertex-cut partitioning problem, we represent a graph as G = (V, E), where V is the set of vertices and E is the set of edges, together with a set of k partitions. Note that the input data is a random list of edges, which the partitioner receives and processes in a streaming manner.
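To make the vertex-cut objective concrete, the sketch below computes two quantities commonly used to judge an edge partitioning: the replication factor (the average number of partitions holding a copy of each vertex) and the load imbalance. The summary above does not state which metrics the paper reports, so these definitions and the function names `replication_factor` and `load_imbalance` are assumptions.

```python
from collections import defaultdict


def replication_factor(edges, assignment):
    """Average number of partitions that hold a copy of each vertex."""
    replicas = defaultdict(set)    # vertex -> partitions containing one of its edges
    for (u, v), p in zip(edges, assignment):
        replicas[u].add(p)
        replicas[v].add(p)
    return sum(len(parts) for parts in replicas.values()) / len(replicas)


def load_imbalance(assignment, k):
    """Size of the largest partition divided by the ideal size |E| / k."""
    sizes = [0] * k
    for p in assignment:
        sizes[p] += 1
    return max(sizes) / (len(assignment) / k)


if __name__ == "__main__":
    edges = [(1, 2), (3, 4), (1, 3), (2, 4)]
    assignment = [0, 1, 0, 1]      # partition index chosen for each edge, in order
    print(replication_factor(edges, assignment))
    print(load_imbalance(assignment, k=2))
```

A replication factor of 1 means no vertex is cut; larger values mean more copies and more synchronization between machines. A load imbalance of 1 means all partitions hold the same number of edges.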

Methods

Results

Conclusion