Abstract

The link traversal strategies to query Linked Data over WWW can retrieve up-to-date results using a recursive URI lookup process in real-time. The downside of this approach comes with the query patterns having subject unbound (i.e. ?S rdf:type:Class). Such queries fail to start up the traversal process as the RDF pages are subject-centric in nature. Thus, zero-knowledge link traversal leads to the empty query results for these queries. In this paper, the authors analyze a large corpus of real-world SPARQL query logs and identify the Most Frequent Predicates (MFPs) occurring in these queries. The knowledge of these MFPs helps in finding and indexing a limited number of triples from the original data set. Additionally, the authors propose a Hybrid Query Execution (HQE) approach to execute the queries over this index for initial data source selection followed by link traversal process to fetch complete results. The evaluation of HQE on the latest real data benchmarks reveals that it retrieves at least five times more results than the existing approaches.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call