Abstract

Ontology-based data access (OBDA) is a popular approach for integrating and querying multiple data sources by means of a shared ontology. The ontology is linked to the sources using mappings, which assign to ontology predicates views over the data. The conventional semantics of OBDA is set-based; that is, the extension of the views defined by the mappings does not contain duplicate tuples. This treatment is, however, in disagreement with the standard semantics of database views and database management systems in general, which is based on bags and where duplicate tuples are retained by default. The distinction between set and bag semantics in databases is very significant in practice, and it influences the evaluation of aggregate queries. In this article, we propose and study a bag semantics for OBDA which provides a solid foundation for the future study of aggregate and analytic queries. Our semantics is compatible with both the bag semantics of database views and the set-based conventional semantics of OBDA. Furthermore, it is compatible with existing bag-based semantics for data exchange recently proposed in the literature. We show that adopting a bag semantics makes conjunctive query answering in OBDA coNP-hard in data complexity. To regain tractability of query answering, we consider suitable restrictions along three dimensions, namely, the query language, the ontology language, and the adoption of the unique name assumption. Our investigation shows a complete picture of the computational properties of query answering under bag semantics over ontologies in the DL-Lite family.

Highlights

  • Ontology-based data access (OBDA) is an increasingly popular approach to enable uniform access to multiple data sources with diverging schemas [2,3,4,5,6,7]

  • For DL-LiteR, the certain answers always coincide with the certain answers under the unique name assumption (UNA), and checking whether a tuple of individuals is in the certain answers to a (union of) conjunctive queries ((U)CQ) q over a DL-LiteR ontology (T, A) is an NP-complete problem with AC0 data complexity [9,24]

  • The second and third lower bounds are established in Theorem 29, where we show similar coNP-hardness results for the cases where the query language is restricted to the class of rooted CQs and the ontology language is allowed to contain role inclusions

Summary

Introduction

Ontology-based data access (OBDA) is an increasingly popular approach to enable uniform access to multiple data sources with diverging schemas [2,3,4,5,6,7]. The paper's example is a relational database instance Dex providing information about the records that trumpeter Miles Davis and pianist Keith Jarrett have cut on two record labels. To integrate this data, a music encyclopedia relies on a DL-LiteR ontology with TBox Tex, which defines unary predicates, called concepts, such as Musician, WindPlayer, and Record, and binary predicates, called roles, such as hasMusician. A discrepancy between the set-based OBDA semantics and the bag semantics of database views may occur even if the TBox of the ontology is empty. In such a case, the evaluation of qex(x) over ABox Aex does not coincide with the evaluation of the rewritten query (σ1(x) in this case) over Dex. Example 2 suggests that the conventional approach to OBDA can faithfully represent only a subset of GAV mapping assertions: those whose SQL query contains the DISTINCT operator in the top-level SELECT clause. The proposed semantics is compatible with (i) the bag semantics of database views; (ii) the set-based conventional semantics of OBDA; and (iii) the bag semantics recently proposed by Hernich and Kolaitis [20] in the context of data exchange.
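The gap between the two semantics can be illustrated with a minimal sketch using Python's built-in sqlite3 module. The table `recording`, its columns, and its rows are illustrative assumptions; they are not the paper's instance Dex or its mappings. The sketch only shows that a view defined without DISTINCT keeps duplicate tuples, so a count over it differs from a count over the set-based view.

```python
import sqlite3

# Hypothetical toy source table (NOT the paper's instance Dex).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE recording (musician TEXT, record TEXT)")
conn.executemany(
    "INSERT INTO recording VALUES (?, ?)",
    [
        ("Miles Davis", "record_a"),    # placeholder record names
        ("Miles Davis", "record_b"),
        ("Keith Jarrett", "record_c"),
    ],
)

# Bag semantics (the SQL default): projecting onto `musician`
# retains one tuple per source row, duplicates included.
bag_view = conn.execute("SELECT musician FROM recording").fetchall()

# Set semantics: DISTINCT in the top-level SELECT removes duplicates,
# which is what a set-based reading of the mapping assumes.
set_view = conn.execute("SELECT DISTINCT musician FROM recording").fetchall()

bag_count = len(bag_view)  # 3: Miles Davis appears twice
set_count = len(set_view)  # 2: each musician appears once
print(bag_count, set_count)
```

Any aggregate that depends on multiplicities, such as a count of tuples per musician, gives different results over the two views, which is why the choice of semantics matters for aggregate queries.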

Contributions and organisation
Preliminaries
Syntax and semantics of DL-LiteR ontologies
Queries over ontologies
A calculus for querying bag databases
Ontology-based data access under bag semantics
The ontology language DL-LitebR
The syntax and semantics of DL-LitebR
Relationship to query answering in OBDA
Relationship of bag and set semantics in the context of DL-LiteR
Unique name assumption
Universal models
Lower bounds for the data complexity of query answering under bag semantics
Universal models for rooted conjunctive queries over DL-LitebCORE ontologies
Rewritability of rooted conjunctive queries over DL-LitebCORE
Non-rewritability to BCALC unions of conjunctive queries
General ideas for rewritability to BCALC queries
Step 1: checking for realisability
Step 2: replacing subqueries with representatives
Step 3: rewriting atoms to BCALC queries
Rewriting and complexity
Rewritability of conjunctive queries over DL-LitebRDFS under UNA
11.1. Bag semantics in data exchange
11.2. Count aggregate queries over ontologies
11.3. Other related work
12. Conclusion and future work
Operations on tuples:
Operations on bags of tuples: