Abstract
In the modern era of big data, large-scale graph computing has become challenging because of the dramatic rise in graph data size. Graph edge partitioning (GEP) is a crucial preprocessing step to distributed graph platforms, yet it is challenging to partition the large-scale graphs. GEP has shown better partition quality than the graph vertex partitioning for the graph’s skewed degree distribution. Existing GEP approaches are classified into two as stream and offline. The former category assigns edges to the partitions based on the previously received edge information. It has less partitioning quality and is affected by stream order compared to the latter while supporting big graph partitioning. The latter uses complete knowledge of a graph during partitioning and hence has a better partitioning quality than the former; however, it does not support large-scale graphs. In this study, we propose a novel OffStream partitioning approach (OSPA) and hybrid graph edge partitioner OffStreamNH. OSPA leverages both the offline and stream graph partitioning approaches through stateful partitioning by introducing a state layer. This stateful partition state is recorded while offline is partitioning its input graph. It contains partial knowledge of previously partitioned data and is used by the stream partitioner. The OffStreamNH uses Neighborhood Expansion (NE) and Higher Degree Replicated First (HDRF) algorithms for the offline and online; respectively, with minor modifications of both algorithms. Experimental results show that OffStreamNH outperforms the state of the art stream partitioners in terms of replication factor, load balance and tolerates the effect of stream orders.
Talk to us
Join us for a 30 min session where you can share your feedback and ask us any queries you have
Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.