Semi-Oblivious Chase Termination: The Sticky Case

Marco Calautti, Andreas Pieris

doi:10.1007/s00224-020-09994-5

Abstract

The chase procedure is a fundamental algorithmic tool in database theory with a variety of applications. A key problem concerning the chase procedure is all-instances termination: for a given set of tuple-generating dependencies (TGDs), is it the case that the chase terminates for every input database? In view of the fact that this problem is undecidable, it is natural to ask whether known well-behaved classes of TGDs, introduced in different contexts such as ontological reasoning, ensure decidability. We consider a prominent paradigm that led to a robust TGD-based formalism, called stickiness. We show that for sticky sets of TGDs, all-instances chase termination is decidable if we focus on the (semi-)oblivious chase, and we pinpoint its exact complexity: PSpace-complete in general, and NLogSpace-complete for predicates of bounded arity. These complexity results are obtained via a graph-based syntactic characterization of chase termination that is of independent interest.

Highlights

– The chase procedure is a fundamental algorithmic tool that has been successfully applied to several database problems such as containment of queries under constraints [2], checking logical implication of constraints [5, 27], computing data exchange solutions [17], and query answering under constraints [11], to name a few
– In Section 4, we provide a semantic characterization of non-termination of the semi-oblivious chase under sticky sets of tuple-generating dependencies (TGDs) via the existence of “path-like” infinite chase derivations, which forms the basis for our decision procedure
– By exploiting the above semantic characterization, we provide, in Section 5, a syntactic characterization of semi-oblivious chase termination via a graph-based condition

This article belongs to the Topical Collection: Special Issue on Database Theory (ICDT 2019). Guest Editor: Pablo Barcelo. Theory of Computing Systems (2021) 65:84–121.

Summary

Introduction

The chase procedure (or chase) is a fundamental algorithmic tool that has been successfully applied to several database problems such as containment of queries under constraints [2], checking logical implication of constraints [5, 27], computing data exchange solutions [17], and query answering under constraints [11], to name a few. In a sense, DΣ acts as a representative of all the models of D and Σ. This is the reason for the ubiquity of the chase in database theory, as discussed in [15].

There are, in principle, three different ways of formalizing this simple idea, which lead to different versions of the chase procedure.

Oblivious Chase. The first one, which gives rise to the oblivious chase, is as follows: for each pair (t, u) of tuples of terms from the instance I constructed so far, apply a TGD σ of the form ∀x∀y (φ(x, y) → ∃z ψ(x, z)) if φ(t, u) ⊆ I and σ has not been applied in a previous chase step due to the same pair (t, u). Applying σ adds to I the set of atoms ψ(t, v), where v is a tuple of new nulls not occurring in I.
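The oblivious chase step described above can be illustrated with a small Python sketch. It is not the authors' implementation. The encoding is an assumption made for illustration: atoms are tuples such as `("R", "a", "b")`, variables are strings starting with `?`, a TGD is a `(body, head)` pair of atom lists, and nulls are named `_n0`, `_n1`, and so on. The sketch also applies all pending triggers round by round and stops after a hypothetical `max_rounds` bound, since the chase need not terminate.

```python
from itertools import count

def is_var(term):
    # Assumption: variables are strings starting with "?"; all other terms
    # are constants or nulls.
    return isinstance(term, str) and term.startswith("?")

def homomorphisms(body, instance, h=None):
    """Yield every assignment of the body variables that maps all body atoms into instance."""
    h = h or {}
    if not body:
        yield dict(h)
        return
    (pred, *args), rest = body[0], body[1:]
    for fact in instance:
        if fact[0] != pred or len(fact) != len(args) + 1:
            continue
        ext, ok = dict(h), True
        for a, v in zip(args, fact[1:]):
            if is_var(a):
                if ext.setdefault(a, v) != v:  # variable already bound differently
                    ok = False
                    break
            elif a != v:                       # constant mismatch
                ok = False
                break
        if ok:
            yield from homomorphisms(rest, instance, ext)

def oblivious_chase(database, tgds, max_rounds=10):
    """Return (instance, finished); finished is False if the round bound was hit."""
    instance = set(database)
    applied = set()   # triggers (TGD index, assignment of ALL body variables)
    fresh = count()   # generator of new null names
    for _ in range(max_rounds):
        fired, new_atoms = False, set()
        for i, (body, head) in enumerate(tgds):
            for h in homomorphisms(body, instance):
                trigger = (i, tuple(sorted(h.items())))
                if trigger in applied:
                    continue  # same TGD, same pair (t, u): never applied twice
                applied.add(trigger)
                fired = True
                g = dict(h)
                for atom in head:
                    for t in atom[1:]:
                        if is_var(t) and t not in g:
                            g[t] = f"_n{next(fresh)}"  # existential variable gets a new null
                    new_atoms.add((atom[0],) + tuple(g.get(t, t) for t in atom[1:]))
        if not fired:
            return instance, True
        instance |= new_atoms
    return instance, False
```

For example, on the database `{("R", "a", "b")}` the TGD R(x, y) → ∃z S(x, z) yields a finished chase after one application, whereas R(x, y) → ∃z R(y, z) keeps producing new triggers and only stops at the round bound.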

Objectives

Results

Conclusion