Abstract

Path queries specify paths in a data graph that match a given pattern. Query languages such as SPARQL usually support regular path patterns, defined by means of regular expressions. Context-free path queries instead select paths whose sequences of edge labels belong to a language defined by a context-free grammar. This kind of query is of practical interest in domains such as genetics, data science, and source code analysis. In this paper, we present an algorithm for context-free path query processing. Our algorithm works by looking for localized paths, which allows us to process subgraphs, in contrast to other approaches that have to process the whole graph. It also takes any context-free grammar as input, avoiding the use of normal forms, which can be problematic: grammar normalization may introduce a large number of non-terminal symbols and production rules, which, in general, increases the runtime and memory consumption of evaluation algorithms. We prove the correctness of our approach and analyze its runtime and memory complexity. We show the viability of our approach by means of prototypes implemented in Go and Python. We run experiments proposed in recent works, which include both synthetic and real RDF databases, and introduce a more realistic scenario inspired by biology. Our algorithm shows performance gains when compared to other algorithms implemented as single-threaded programs.
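
To make the notion of a context-free path query concrete, the following is a minimal Python sketch of a naive fixpoint evaluator. It is not the algorithm presented in this paper: it scans the whole graph on every pass and is meant only as an illustration of what such a query computes. The function name `cfpq`, the edge-triple representation, and the dictionary encoding of the grammar are all assumptions made for this example. Like the approach described above, it accepts productions of any shape, so no normal form is required.

```python
from collections import defaultdict


def cfpq(edges, grammar, start):
    """Naive context-free path query evaluation (illustrative only).

    edges   -- iterable of (source, label, target) triples
    grammar -- dict mapping a non-terminal to a list of bodies, each body
               being a list of symbols (an empty list is an empty production)
    start   -- the non-terminal to query
    Terminal and non-terminal names are assumed to be disjoint.
    Returns the set of (u, v) pairs linked by a path whose labels
    spell a word derivable from `start`.
    """
    edges = list(edges)
    nodes = {u for u, _, _ in edges} | {v for _, _, v in edges}

    # reach[symbol][u] holds every node v reachable from u along a path
    # whose labels form a word derivable from `symbol`.
    reach = defaultdict(lambda: defaultdict(set))
    for u, label, v in edges:
        reach[label][u].add(v)

    changed = True
    while changed:  # repeat until no production yields a new pair
        changed = False
        for head, bodies in grammar.items():
            for body in bodies:
                for u in nodes:
                    # Follow the body symbol by symbol, starting at u.
                    frontier = {u}
                    for symbol in body:
                        frontier = {w for x in frontier for w in reach[symbol][x]}
                        if not frontier:
                            break
                    new = frontier - reach[head][u]
                    if new:
                        reach[head][u] |= new
                        changed = True

    return {(u, v) for u in reach[start] for v in reach[start][u]}


# Example: S -> a S b | a b over the chain 0 -a-> 1 -a-> 2 -b-> 3 -b-> 4.
example_edges = [(0, "a", 1), (1, "a", 2), (2, "b", 3), (3, "b", 4)]
example_grammar = {"S": [["a", "S", "b"], ["a", "b"]]}
result = cfpq(example_edges, example_grammar, "S")
print(sorted(result))
```

On this example graph, the query relates node 1 to node 3 (labels `a b`) and node 0 to node 4 (labels `a a b b`); no regular expression can express this matched nesting for unbounded depth.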

