Abstract

The increasing importance of graph data in various fields requires large-scale graph data to be processed efficiently. Well-balanced graph partitioning is a vital component of parallel/distributed graph processing. The goal of graph partitioning is to obtain a well-balanced graph topology, in which the partitions are of similar size and the number of edge cuts is reduced. Moreover, a graph-partitioning algorithm should achieve high performance and scalability. In this study, we present a novel graph-partitioning algorithm that achieves high edge-cut quality and excellent parallel processing performance. We apply formulas based on the label propagation algorithm to improve the quality of edge cuts and achieve fast convergence. In our approach, the label propagation process no longer needs to be applied to all vertices; it is applied only to candidate vertices selected by a score metric. Our proposed algorithm introduces a stabilization phase in which remote and highly connected vertices are relocated to prevent the algorithm from becoming trapped in local optima. Comparison results show that a prototype based on the proposed algorithm outperforms well-known parallel graph-partitioning frameworks in terms of speed and balance.
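The idea of restricting label propagation to candidate vertices can be illustrated with a minimal, sequential Python sketch. It is not the algorithm proposed in the paper: the function name `candidate_label_propagation`, the `capacity` parameter, and the candidate rule (a vertex qualifies if at least one neighbor lies in another partition) are assumptions made for illustration, because the paper's score formulas are not given in this summary.

```python
from collections import Counter


def candidate_label_propagation(adj, part, capacity, max_iters=10):
    """Move boundary vertices toward the partition holding most of their neighbors.

    adj:      dict mapping each vertex to a list of its neighbors
    part:     dict mapping each vertex to its partition id (updated in place)
    capacity: maximum number of vertices allowed in one partition (assumed)
    """
    sizes = Counter(part.values())
    for _ in range(max_iters):
        # Assumed candidate rule: only vertices with a neighbor in another
        # partition are visited; interior vertices are skipped entirely.
        candidates = [v for v in adj if any(part[u] != part[v] for u in adj[v])]
        moved = 0
        for v in candidates:
            counts = Counter(part[u] for u in adj[v])
            current = part[v]
            best, best_count = current, counts.get(current, 0)
            for p, c in sorted(counts.items()):
                # Move only on a strict gain and only if the target has room.
                if p != current and c > best_count and sizes[p] < capacity:
                    best, best_count = p, c
            if best != current:
                sizes[current] -= 1
                sizes[best] += 1
                part[v] = best
                moved += 1
        if moved == 0:
            break  # no vertex changed partition, so the labels are stable
    return part


# Two triangles (0, 1, 2) and (3, 4, 5) joined by the edge 2-3.
adj = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2, 4, 5], 4: [3, 5], 5: [3, 4]}
part = {0: 0, 1: 0, 2: 1, 3: 1, 4: 1, 5: 1}  # vertex 2 starts on the wrong side
print(candidate_label_propagation(adj, part, capacity=3))
```

In this toy graph only the boundary vertices are examined, and vertex 2 moves to partition 0 because two of its three neighbors are there and the partition still has room.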

Highlights

  • Graph data have become increasingly important for applications in various fields, such as e-science, medical information systems, and social data management systems [1]

  • We describe the parallel processing of our graph-partitioning algorithm, which consists of the quick-converging label propagation (QCLP) phase and the stabilization phase

  • Because QCLP and higher connectivity to remote vertices (HCRV) are similar but use different flows, we describe only those parts that differ from the QCLP phase


Summary

INTRODUCTION

Graph data have become increasingly important for applications in various fields, such as e-science, medical information systems, and social data management systems [1]. Many previous approaches to graph partitioning are based on a local search algorithm, such as the Kernighan–Lin (KL) [20] and Fiduccia–Mattheyses (FM) [21] algorithms. These algorithms require considerable computation to obtain the optimal edge cut as the number of vertices (nodes) and partitions increases. We propose a novel parallel graph-partitioning algorithm that provides a low edge-cut degree and high-performance processing capability for large-scale graph data. In label propagation (LP), distributed machines need to share the new position (i.e., partition) or score of each vertex for the next iteration. In this process, a correlation exists between the overhead of the data update frequency and the accuracy of the vertex position: more frequent updates result in the vertex position being more accurate, which in turn increases the quality of the edge-cut partitioning.

(Source: M. Bae et al., LP-Based Parallel Graph Partitioning for Large-Scale Graph Data)
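The two quantities that partitioning tries to control, the number of cut edges and the balance of partition sizes, can be computed as in the following Python sketch. The function names `edge_cut` and `imbalance` and the definition of imbalance (largest partition size divided by the ideal size) are assumptions chosen for illustration; the paper may define its metrics differently.

```python
from collections import Counter


def edge_cut(edges, part):
    """Count the edges whose two endpoints lie in different partitions."""
    return sum(1 for u, v in edges if part[u] != part[v])


def imbalance(part, k):
    """Return the largest partition size divided by the ideal size len(part) / k.

    A value of 1.0 means the partitions are perfectly balanced.
    """
    sizes = Counter(part.values())
    ideal = len(part) / k
    return max(sizes.get(p, 0) for p in range(k)) / ideal


# Two triangles (0, 1, 2) and (3, 4, 5) joined by the edge 2-3.
edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (3, 5), (4, 5)]
good = {0: 0, 1: 0, 2: 0, 3: 1, 4: 1, 5: 1}
poor = {0: 0, 1: 0, 2: 1, 3: 1, 4: 1, 5: 1}
print(edge_cut(edges, good), imbalance(good, 2))
print(edge_cut(edges, poor), imbalance(poor, 2))
```

The first assignment cuts only the bridge edge and is perfectly balanced, whereas the second cuts two edges and leaves one partition with four of the six vertices.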

RELATED WORK
DEFINITION OF THE SCORE
EXPERIMENT
Findings
CONCLUSION
