Optimized Distributed Subgraph Matching Algorithm Based on Partition Replication

Ling Yuan, Peng Pan, Jiali Bin

doi:10.3390/electronics9010184


Open Access

https://doi.org/10.3390/electronics9010184


Journal: Electronics	Publication Date: Jan 18, 2020
Citations: 2	License type: CC BY 4.0

Affiliation: Huazhong University of Science and Technology

Abstract

With the explosive growth in data scale, subgraph matching on massive graph data struggles to achieve satisfactory efficiency. Meanwhile, the graph indexes used by existing subgraph matching algorithms are difficult to update and maintain on dynamic graphs. We propose a distributed subgraph matching algorithm based on Partition Replica (denoted PR-Match) that handles the partitioning and storage of large-scale data graphs. The PR-Match algorithm first splits the query graph into sub-queries, then assigns the sub-queries to the nodes for subgraph matching, and finally merges the matching results. Within PR-Match, we propose a heuristic rule based on predicted cost to select the optimal merging plan, which greatly reduces the cost of merging. To accelerate sub-query matching, we propose a vertex code based on the vertex neighbor label signature, which greatly reduces the search space for each sub-query. Because the vertex code can be updated incrementally, it avoids the maintenance problem that feature-based graph indexes face on dynamic graphs. Extensive experiments on real and synthetic datasets demonstrate the high efficiency and strong scalability of the PR-Match algorithm when handling large-scale data graphs.
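To make the idea of a neighbor label signature concrete, the following is a minimal sketch, not the encoding used in the paper, whose exact vertex code is not given on this page. It assumes a bitmask with one bit per label; the names `LabeledGraph`, `label_bit`, and `candidates` are hypothetical. A data vertex can match a query vertex only if it has the same label and its code covers every bit of the query vertex's code. Inserting an edge only ORs one bit into each endpoint's code, which illustrates why such a code is cheap to maintain as the graph grows.

```python
from collections import defaultdict

# Hypothetical global registry mapping each label to a bit position,
# shared by the query graph and the data graph.
_LABEL_BITS = {}


def label_bit(label):
    """Return a one-bit mask for `label`, assigning a new bit on first use."""
    if label not in _LABEL_BITS:
        _LABEL_BITS[label] = len(_LABEL_BITS)
    return 1 << _LABEL_BITS[label]


class LabeledGraph:
    """Undirected labeled graph with a per-vertex neighbor-label bitmask."""

    def __init__(self):
        self.label = {}               # vertex -> label
        self.adj = defaultdict(set)   # vertex -> set of neighbors
        self.code = defaultdict(int)  # vertex -> OR of neighbor label bits

    def add_vertex(self, v, label):
        self.label[v] = label

    def add_edge(self, u, v):
        # Incremental update: only the two endpoint codes change.
        self.adj[u].add(v)
        self.adj[v].add(u)
        self.code[u] |= label_bit(self.label[v])
        self.code[v] |= label_bit(self.label[u])


def candidates(query, qv, data):
    """Data vertices that pass the label and signature filter for query vertex qv.

    The filter is a necessary condition only: it ignores how many neighbors
    carry each label, so survivors still need full verification.
    """
    need = query.code[qv]
    return sorted(
        v for v, lab in data.label.items()
        if lab == query.label[qv] and (data.code[v] & need) == need
    )
```

A plain bitmask cannot be updated on edge deletion without rescanning the neighbors; keeping a per-label neighbor count instead would be one way to handle that case.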

Highlights

A graph is a semi-structured data structure composed of vertices and edges, usually denoted G(V, E), where V is the set of vertices and E is the set of edges between vertices.
We propose a distributed subgraph matching algorithm based on Partition Replica to process subgraph matching on large-scale data graphs.
In the PR-Match algorithm, we design a partition and storage scheme for large-scale data graphs based on the theory of equilibrium separation of large graphs, develop a highly efficient vertex code index that supports fast updating and maintenance on dynamic graphs, and establish heuristic rules based on predicted overhead to determine the order in which sub-query matching results are merged.

Summary

Introduction

A graph is a semi-structured data structure composed of vertices and edges, usually denoted G(V, E), where V is the set of vertices and E is the set of edges between vertices. Existing distributed subgraph matching mainly relies on RDF graph engines and the MapReduce computing framework, which can hardly achieve satisfactory efficiency. To solve these problems, we propose a distributed subgraph matching algorithm based on Partition Replica (denoted PR-Match) to process subgraph matching on large-scale data graphs. In the PR-Match algorithm, we design a partition and storage scheme for large-scale data graphs based on the theory of equilibrium separation of large graphs, develop a highly efficient vertex code index that supports fast updating and maintenance on dynamic graphs, and establish heuristic rules based on predicted overhead to determine the order in which sub-query matching results are merged.
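The merging step can be illustrated with a small sketch. This page does not state the paper's cost model, so the code below assumes a simple stand-in: the predicted cost of joining two intermediate results is the product of their sizes, and pairs with no shared query vertex are treated as infinitely expensive so that Cartesian products are postponed. The function names `join`, `predicted_cost`, and `greedy_merge` are hypothetical. Each partial match is a dict from query vertex to data vertex.

```python
from itertools import combinations


def join(r1, r2):
    """Nested-loop join of two lists of partial matches.

    Two partial matches combine if they agree on every shared query vertex
    and the combined mapping is still injective (no data vertex used twice).
    """
    out = []
    for m1 in r1:
        for m2 in r2:
            if all(m1[q] == m2[q] for q in m1.keys() & m2.keys()):
                merged = {**m1, **m2}
                if len(set(merged.values())) == len(merged):
                    out.append(merged)
    return out


def predicted_cost(r1, r2):
    """Assumed cost model: |r1| * |r2|, or infinity without a shared vertex."""
    if not r1 or not r2:
        return 0  # joining with an empty result is trivially cheap
    shared = r1[0].keys() & r2[0].keys()
    return len(r1) * len(r2) if shared else float("inf")


def greedy_merge(results):
    """Repeatedly join the pair with the lowest predicted cost.

    Assumes `results` holds at least one list of partial matches.
    """
    pending = list(results)
    while len(pending) > 1:
        i, j = min(
            combinations(range(len(pending)), 2),
            key=lambda p: predicted_cost(pending[p[0]], pending[p[1]]),
        )
        merged = join(pending[i], pending[j])
        pending = [r for k, r in enumerate(pending) if k not in (i, j)]
        pending.append(merged)
    return pending[0]
```

A greedy choice like this is not guaranteed to find the globally cheapest plan; it only shows how a predicted cost can drive the merge order.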

Related Work

Problem Definition

Graph Data Partition

Query Decomposition

Subquery Matching

Intermediate Result Merge

Subgraph Matching on Small Graphs

Path Query

Clique Query

Random Query

Scalability Test of PR-Match Algorithm

Data Size

Average Vertex Degree

Experiment Summary

Conclusions

Similar Papers

MPMatch: A Multi-core Parallel Subgraph Matching Algorithm
Xin Jin ... Longbin Lai
01 Apr 2019

Efficient Subgraph Matching on Non-volatile Memory
Yishu Shen ... Zhaonian Zou
01 Jan 2017

A survey of continuous subgraph matching for dynamic graphs
Xi Wang ... Xiang Zhao
Knowledge and Information Systems | VOL. 65
19 Oct 2022

OSMAC: Optimizing Subgraph Matching Algorithms with Community Structure
Yunkai Lou ... Chaokun Wang
01 Apr 2019
