Subgraph matching finds substructures of a data graph that are similar to a given query graph, and it is a basic query in graph data management. Many categories of subgraph matching solutions exist. Subgraph isomorphism, an NP-complete problem, was the initial formulation of the subgraph matching task. To speed up matching, graph simulation was introduced, which matches subgraphs in polynomial time. Unfortunately, graph simulation often fails to preserve the topology of matched subgraphs because its matching conditions are loose. In this paper, we propose an approximation approach named kSGM (top-k SubGraph Matching) for subgraph matching based on twig patterns. First, we transform query graphs into twig patterns and match candidate substructures in the graph data. Second, we present an optimized join strategy combined with a top-k mechanism, including join order selection based on cost evaluation and optimized pruning based on the maximum and minimum possible scores. Finally, we design experiments on real-life and synthetic graph data to evaluate the performance of our work. The results show that, compared to existing algorithms, kSGM clearly reduces the time cost of answering subgraph matching queries while guaranteeing the correctness of the answers.
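The idea of pruning on a maximum possible score can be illustrated with a generic top-k sketch. This is not the kSGM algorithm itself: the function name `top_k_with_pruning`, the candidate layout, and the two scoring callbacks are hypothetical and chosen only for illustration. It assumes each candidate has a cheap upper bound on its score and a more expensive exact score, and it skips the exact evaluation whenever the upper bound cannot beat the current k-th best result.

```python
import heapq
from typing import Callable, Iterable, List, Tuple, TypeVar

T = TypeVar("T")


def top_k_with_pruning(
    candidates: Iterable[T],
    k: int,
    upper_bound: Callable[[T], float],
    exact_score: Callable[[T], float],
) -> Tuple[List[Tuple[float, T]], int]:
    """Return the k best (score, candidate) pairs and the number of exact evaluations."""
    # Min-heap of (score, arrival index, candidate); heap[0] is the current k-th best.
    # The arrival index breaks ties so candidates themselves are never compared.
    heap: List[Tuple[float, int, T]] = []
    evaluated = 0
    for index, cand in enumerate(candidates):
        # Prune: the candidate's best possible score cannot beat the k-th best so far.
        if len(heap) == k and upper_bound(cand) <= heap[0][0]:
            continue
        score = exact_score(cand)
        evaluated += 1
        if len(heap) < k:
            heapq.heappush(heap, (score, index, cand))
        elif score > heap[0][0]:
            heapq.heapreplace(heap, (score, index, cand))
    # Highest score first; earlier arrival wins ties.
    ranked = sorted(heap, key=lambda item: (-item[0], item[1]))
    return [(score, cand) for score, _, cand in ranked], evaluated


# Hypothetical candidates: (name, upper bound on score, exact score).
sample = [
    ("a", 0.90, 0.8),
    ("b", 0.70, 0.6),
    ("c", 0.50, 0.5),
    ("d", 0.95, 0.9),
    ("e", 0.55, 0.3),
]
best, exact_calls = top_k_with_pruning(
    sample, k=2, upper_bound=lambda c: c[1], exact_score=lambda c: c[2]
)
```

In this toy run only three of the five candidates need an exact score, because the upper bounds of `c` and `e` already fall below the current second-best result. The pruning is safe only if the upper bound never underestimates the exact score.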