Abstract

Subgraph matching on large graphs has become a popular research topic in graph analysis, with a wide range of applications including question answering and community detection. However, the traditional edge-cutting strategy destroys the structure of indivisible knowledge in a large RDF graph. On the premise of load balancing in subgraph division, a dominance-partitioned strategy is proposed to divide a large RDF graph without compromising the knowledge structure. First, a dominance-connected pattern graph is extracted from a pattern graph to construct a dominance-partitioned pattern hypergraph, which divides the pattern graph into multiple fish-shaped pattern subgraphs. Second, a dominance-driven spectrum clustering strategy is used to gather the pattern subgraphs into multiple clusters. Third, a dominance-partitioned subgraph matching algorithm is designed to find all isomorphic subgraphs on a cluster-partitioned RDF graph. Finally, experimental evaluation verifies that our strategy achieves higher time efficiency on complex queries and better scalability across multiple machines and different data scales.
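To make the edge-cutting problem concrete, the following minimal sketch counts the triples whose endpoints land on different partitions and reports the load per partition. It is an illustration under stated assumptions, not the paper's partitioner: the function name `partition_stats`, the toy triples, and the vertex-to-partition map `owner` are all hypothetical, and each triple is assumed to be stored with its subject.

```python
from collections import Counter
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)


def partition_stats(
    triples: List[Triple], owner: Dict[str, int]
) -> Tuple[List[Triple], Counter]:
    """Return the cut triples and the number of triples per partition.

    A triple is "cut" when its subject and object are owned by different
    partitions. Assumption: each triple is stored with its subject.
    """
    cut = [t for t in triples if owner[t[0]] != owner[t[2]]]
    load = Counter(owner[t[0]] for t in triples)
    return cut, load


# Hypothetical toy data: a triangle alice -> course1 <- prof1 <- alice.
triples = [
    ("alice", "takesCourse", "course1"),
    ("bob", "takesCourse", "course1"),
    ("alice", "advisor", "prof1"),
    ("prof1", "teacherOf", "course1"),
    ("bob", "advisor", "prof2"),
]
# Hypothetical assignment that places students and courses apart.
owner = {"alice": 0, "bob": 0, "prof2": 0, "course1": 1, "prof1": 1}

cut, load = partition_stats(triples, owner)
print(len(cut), dict(load))
```

With this assignment the triangle around `alice` is split across both partitions, so matching it needs cross-partition communication, and the load is uneven as well. This is the kind of structural damage and imbalance that a structure-preserving partitioning strategy aims to avoid.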

Highlights

  • An expensive cost is incurred by the excessive join operations over relational tables

  • We propose a dominance-partitioned subgraph matching algorithm to find all isomorphic subgraphs on a cluster-partitioned Resource Description Framework (RDF) graph

  • An ordered query graph is shown in Figure 11, where the circles filled with left-diagonal lines denote the first executed region, the circles filled with right-diagonal lines indicate the second executed region, and the unfilled circles refer to the final executed region. The filled circles are included in circular-pattern subgraphs and u7 is the regional juncture. Then, the subgraph matching is iteratively conducted by our circular-pattern first matching order


Summary

Preliminaries

The definitions of RDF graph and subgraph matching are given first. Then, the related research is introduced. The iterative mapping and reducing operations on RDF triples can incur expensive time consumption on query graphs with complex topological structure. TriAD [4] combined join-ahead pruning, in the form of RDF graph summarization, with a locality-based horizontal partitioning of RDF triples into a grid-like distributed index structure. The graph-based traversal strategies were employed to store RDF data in native graph format, focusing on the construction of data indexes and pruning rules for redundant intermediate results. BitMat [19] proposed a compressed bit-matrix structure to store huge RDF graphs, and a variable-binding-matching algorithm was designed to directly produce the final results without indexing the intermediate results. The third empirical study [32] is our previous work on subgraph matching over a static knowledge graph, which constructed a flow-based subgraph index to reduce redundant RDF data.
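As a point of reference for the subgraph matching task introduced here, the sketch below enumerates matches of a small query graph over a list of RDF triples by naive backtracking. It is a baseline illustration only, not the dominance-partitioned algorithm and not any of the cited systems. The names `match` and `is_var`, the `?`-prefix convention for query variables, and the toy data are assumptions made for this example; predicates are assumed to be constants, and distinct variables are forced onto distinct data vertices to mimic isomorphism-style matching.

```python
from typing import Dict, Iterator, List, Optional, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)
Binding = Dict[str, str]


def is_var(term: str) -> bool:
    """Assumed convention: query terms starting with '?' are variables."""
    return term.startswith("?")


def _bind(term: str, value: str, binding: Binding) -> Optional[Binding]:
    """Try to map one query term to one data vertex; None on conflict."""
    if not is_var(term):
        return binding if term == value else None
    if term in binding:
        return binding if binding[term] == value else None
    if value in binding.values():
        # Injectivity: two distinct variables may not share a data vertex.
        return None
    extended = dict(binding)
    extended[term] = value
    return extended


def match(query: List[Triple], data: List[Triple]) -> Iterator[Binding]:
    """Yield every binding that maps all query triples onto data triples."""

    def extend(i: int, binding: Binding) -> Iterator[Binding]:
        if i == len(query):
            yield binding
            return
        qs, qp, qo = query[i]
        for s, p, o in data:  # full scan; real systems would use an index
            if p != qp:
                continue
            b = _bind(qs, s, binding)
            if b is None:
                continue
            b = _bind(qo, o, b)
            if b is None:
                continue
            yield from extend(i + 1, b)

    yield from extend(0, {})


# Hypothetical toy data and a triangle-shaped query.
data = [
    ("alice", "takesCourse", "course1"),
    ("bob", "takesCourse", "course1"),
    ("alice", "advisor", "prof1"),
    ("prof1", "teacherOf", "course1"),
    ("bob", "advisor", "prof2"),
]
query = [
    ("?s", "takesCourse", "?c"),
    ("?s", "advisor", "?p"),
    ("?p", "teacherOf", "?c"),
]
results = list(match(query, data))
print(results)
```

Each query triple triggers a full scan of the data, so the cost grows quickly with query size. Indexing, pruning of intermediate results, and a good matching order are the levers that the approaches surveyed above use to reduce that cost.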

Framework of Dominance-Partitioned RDF Graph
Subgraph Matching on k-Partitioned RDF Graph
Findings
Conclusions