Abstract

The Web of Data aims to link Internet data repositories. Semantic Web technologies make data easily readable by computer agents, enabling the automation of complex tasks and facilitating data integration. They pave the way for a Web of Data in which users can query the connected datasets in search-engine style, i.e. by using keywords. However, querying semantic repositories in a user-friendly way, without requiring mastery of query languages such as SPARQL, is still a challenging task. In this work, we present Semankey, an approach for automatically building SPARQL queries from a list of keywords entered by the user. Semankey identifies semantic entities in the keywords by using a domain ontology to interpret the meaning of the query, and automatically builds a set of queries by connecting the entities through the relationships described in the ontology and by applying query size-based heuristics. The main contributions of Semankey are the use of query filters and the generation of multiple SPARQL queries derived from the different interpretations of the given input, according to the underlying domain ontology. We used the data from the Question Answering over Linked Data challenge to evaluate our approach in different execution modes and to analyse the query trees generated, obtaining a precision of 0.52 and a recall of 0.60 when considering the best answer provided per test case.
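To make the general idea of keyword-to-SPARQL translation concrete, the following is a minimal sketch and not Semankey's actual algorithm. It assumes a toy ontology (the `ex:` classes and properties, `CLASS_LABELS`, `RELATIONS`, and the function names are all hypothetical), matches keywords to classes by exact label, connects matched classes through a declared relationship, turns unmatched keywords into label filters, and orders candidates by number of patterns as a crude stand-in for a size-based heuristic.

```python
# Hypothetical toy ontology: keyword label -> class IRI, plus class-to-class relations.
CLASS_LABELS = {"film": "ex:Film", "director": "ex:Director", "actor": "ex:Actor"}
RELATIONS = [
    ("ex:Film", "ex:directedBy", "ex:Director"),
    ("ex:Film", "ex:starring", "ex:Actor"),
]
PREFIXES = (
    "PREFIX ex: <http://example.org/> "
    "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#> "
)


def var_for(class_iri):
    """Derive a SPARQL variable from a class IRI, e.g. ex:Film -> ?film."""
    return "?" + class_iri.split(":")[1].lower()


def build_queries(keywords):
    """Return candidate SPARQL queries, fewest patterns first."""
    classes = [CLASS_LABELS[k.lower()] for k in keywords if k.lower() in CLASS_LABELS]
    # Keywords that match no class are treated as literal values to filter on.
    literals = [k for k in keywords if k.lower() not in CLASS_LABELS]
    candidates = []
    for subj, prop, obj in RELATIONS:
        # Only relations whose two ends were both mentioned yield a query.
        if subj in classes and obj in classes:
            patterns = [
                f"{var_for(subj)} a {subj} .",
                f"{var_for(obj)} a {obj} .",
                f"{var_for(subj)} {prop} {var_for(obj)} .",
            ]
            # Simplification: each literal filters the label of the object entity.
            # Literals are not escaped here, so quotes in keywords would break the query.
            for i, lit in enumerate(literals):
                patterns.append(f"{var_for(obj)} rdfs:label ?label{i} .")
                patterns.append(
                    f'FILTER(CONTAINS(LCASE(STR(?label{i})), "{lit.lower()}"))'
                )
            candidates.append(patterns)
    candidates.sort(key=len)
    return [PREFIXES + "SELECT * WHERE { " + " ".join(p) + " }" for p in candidates]


if __name__ == "__main__":
    for query in build_queries(["film", "director", "Nolan"]):
        print(query)
```

In this sketch a keyword list yields zero, one, or several candidate queries depending on how many ontology relationships connect the matched classes, which mirrors in a very reduced form the idea of generating multiple interpretations of one input.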

Highlights

  • Data have become an invaluable asset in fields such as research and business

  • The keywords are analysed by using the semantic context specified by a domain ontology and the Resource Description Framework (RDF) data, which conforms to the domain ontology, together with a set of predefined filter expressions

  • At least one SPARQL query retrieved the right answer for 55 test cases


Summary

Introduction

Data have become an invaluable asset in fields such as research and business. Traditionally, humans have analysed data to acquire new knowledge or to make decisions. The number of ontologies and RDF repositories doubled between 2011 and 2014 [5]. This trend is reflected in the increasing number of RDF repositories in the Linked Open Data Cloud (https://lod-cloud.net/) since 2014 (1255 as of May 2020). These repositories include data from multiple domains, such as medicine [6], genetics [7], biology [8], and tourism [9], among others, and use a semantic representation format that allows the data to be processed automatically by computers and enables the development of advanced semantic-based applications. The lack of user-friendly query interfaces makes these repositories difficult to exploit for non-experts in semantic technologies and in SPARQL.

