Abstract

Evaluating Regular Path Queries (RPQs) has been of interest since they were introduced as a powerful way to explore paths and patterns in graph databases. Traditional automata-based approaches are limited by graph size and query complexity, which leads to a high evaluation cost (e.g., memory space and response time) on large graphs. Recently, the approach based on a threshold rare label has achieved some success on large graphs, but it often cannot guarantee the minimum searching cost. Alternatively, the Unit-Subquery Cost Matrix (USCM) has been studied and has shown the viability of using subqueries. Nevertheless, this method does not take the cost among subqueries into account, which causes long response times on a large graph. To overcome this issue, this paper proposes a method, namely USCM-Join, for estimating the joining cost of subqueries to accelerate the USCM-based parallel evaluation of RPQs on a large graph. Using real-world datasets, we experimentally show that USCM-Join outperforms other approaches and that estimating the joining cost improves the USCM-based approach by up to around 20% in terms of response time.

Highlights

  • For a given Regular Path Query (RPQ), the searching cost of every possible set of its subqueries is estimated with the Unit-Subqueries Cost Matrix (USCM), and the RPQ is split into the best set of subqueries, i.e., the one with the minimum estimated searching cost

  • We observed that USCM-Basic, which estimates only the searching cost, reduces the average response time by approximately 13%, 56%, 17%, 25%, and 60% compared to the threshold rare label (TRL) based approach on Yago, Freebase, Alibaba, Smart Building, and the synthetic graph 320 K, respectively

  • This paper proposes a method for estimating the joining cost of subqueries in order to accelerate the USCM-based parallel evaluation of regular path queries (RPQs), namely

Summary

Introduction

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations. Nguyen et al. [11] used the Unit-Subqueries Cost Matrix (USCM) to estimate the searching cost of RPQs and showed the viability of using subqueries in RPQ evaluation. This method does not take the joining cost among subqueries into account. We show how to improve the evaluation performance of RPQs by splitting them with a combination of the estimated joining and searching costs. Through both real-world graphs and synthetic graphs, we experimentally demonstrate that USCM-Join outperforms the original USCM-based method and other approaches. Conference on Smart Media and Applications (SMA 2020) [12]
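The idea of choosing a split by combining estimated searching and joining costs can be illustrated with a small sketch. This is not the paper's algorithm or cost model. It assumes a query that is a plain concatenation of labels, a hypothetical `search_cost` table standing in for USCM entries, a hypothetical `join_cost` function, and a total cost that simply sums the two. All names and numbers below are made up for illustration.

```python
from itertools import combinations


def enumerate_splits(n):
    """Yield every way to cut positions 0..n into contiguous (start, end) segments."""
    for k in range(n):  # k = number of cut points, from 0 to n - 1
        for cuts in combinations(range(1, n), k):
            bounds = (0, *cuts, n)
            yield [(bounds[i], bounds[i + 1]) for i in range(len(bounds) - 1)]


def best_split(labels, search_cost, join_cost):
    """Return (total_cost, segments) for the cheapest split of `labels`.

    Assumption: total cost = sum of per-subquery searching costs
    plus the joining cost between each pair of adjacent subqueries.
    """
    best = None
    for segments in enumerate_splits(len(labels)):
        total = sum(search_cost[s] for s in segments)
        total += sum(join_cost(a, b) for a, b in zip(segments, segments[1:]))
        if best is None or total < best[0]:
            best = (total, segments)
    return best


# Hypothetical query a/b/c and made-up cost estimates per (start, end) segment.
labels = ["a", "b", "c"]
search_cost = {
    (0, 1): 2, (1, 2): 1, (2, 3): 2,  # single-label subqueries
    (0, 2): 10, (1, 3): 4,            # two-label subqueries
    (0, 3): 30,                       # the whole query, unsplit
}

# Searching cost only: the finest split looks cheapest.
print(best_split(labels, search_cost, lambda a, b: 0))
# With a flat joining cost of 2 per join: a coarser split wins.
print(best_split(labels, search_cost, lambda a, b: 2))
```

With these made-up numbers, ignoring the joining cost selects three single-label subqueries, while charging for each join selects the two-subquery split `a` and `b/c`. This shows why a search-only estimate can pick a split that is more expensive overall.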

Related Work
Graph Data and Regular Path Queries
USCM-Based Splitting RPQs for Parallel Evaluation
USCM-Based Parallel Evaluation of RPQs by Estimating Joining Cost
Estimating Parallel Evaluation Cost
Parallel Evaluation of RPQs Based on Minimum Estimated Evaluation Cost
Experimental Evaluation
Evaluation Settings
Experimental Results
Conclusions