Abstract

We study the problem of query evaluation on probabilistic graphs, namely, tuple-independent probabilistic databases over signatures of arity two. We focus on the class of queries closed under homomorphisms, or, equivalently, the infinite unions of conjunctive queries. Our main result states that the probabilistic query evaluation problem is #P-hard for all unbounded queries from this class. As bounded queries from this class are equivalent to a union of conjunctive queries, they are already classified by the dichotomy of Dalvi and Suciu (2012). Hence, our result and theirs imply a complete data complexity dichotomy, between polynomial time and #P-hardness, on evaluating homomorphism-closed queries over probabilistic graphs. This dichotomy covers in particular all fragments of infinite unions of conjunctive queries over arity-two signatures, such as negation-free (disjunctive) Datalog, regular path queries, and a large class of ontology-mediated queries. The dichotomy also applies to a restricted case of probabilistic query evaluation called generalized model counting, where fact probabilities must be 0, 0.5, or 1. We show the main result by reducing from the problem of counting the valuations of positive partitioned 2-DNF formulae, or from the source-to-target reliability problem in an undirected graph, depending on properties of minimal models for the query.

Highlights

  • The management of uncertain and probabilistic data is an important problem in many applications, e.g., automated knowledge base construction [DGH+14, HSBW13, MCH+15], data integration from diverse sources, predictive and stochastic modeling, and applications based on sensor readings.

  • Our result implies a dichotomy on probabilistic query evaluation (PQE) for infinite unions of conjunctive queries (UCQ∞) over probabilistic graphs: as bounded UCQ∞ queries are equivalent to UCQs, they are already classified by Dalvi and Suciu, and we show that all other UCQ∞ queries are unsafe, i.e., PQE is #P-hard for them.

  • We have shown that PQE is #P-hard for any unbounded UCQ∞ over an arity-two signature, and proved a dichotomy on PQE for all UCQ∞ queries: either they are unbounded and PQE is #P-hard, or they are bounded and the dichotomy by Dalvi and Suciu applies.


Summary

Introduction

The management of uncertain and probabilistic data is an important problem in many applications, e.g., automated knowledge base construction [DGH+14, HSBW13, MCH+15], data integration from diverse sources, predictive and stochastic modeling, and applications based on (error-prone) sensor readings. Dalvi and Suciu [DS12] obtained a dichotomy for evaluating unions of conjunctive queries (UCQs) on tuple-independent probabilistic databases (TIDs). Their dichotomy is measured in data complexity, i.e., as a function of the input TID, with the query being fixed. In the terminology of Dalvi and Suciu, a UCQ Q is called safe if the probabilistic query evaluation problem PQE(Q) can be solved in polynomial time, and it is called unsafe otherwise. This dichotomy result laid the foundation for many other studies on the complexity of probabilistic query evaluation [ABS16, CDV21, FO16, JL12, OH08, OH09, RS09].

The remaining unbounded queries in UCQ∞ are those with no model featuring a non-iterable edge. For these queries, we give a reduction from the source-to-target connectivity problem in an undirected graph (#U-ST-CON).
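To make the objects above concrete, the following is a minimal sketch of what PQE computes on a tuple-independent probabilistic graph: the total probability of the possible worlds that satisfy a fixed Boolean query. It is an illustration only, not the paper's reduction. The function names `pqe` and `st_connected`, the dictionary encoding of the TID, and the three-edge example graph are all assumptions made for this sketch. The example query is undirected source-to-target connectivity with every edge kept with probability 1/2, which is the flavor of counting problem that #U-ST-CON refers to. The enumeration takes time exponential in the number of facts, which is the cost that the hardness results say cannot be avoided in general.

```python
from fractions import Fraction
from itertools import product


def pqe(tid, query):
    """Brute-force PQE over a tuple-independent database.

    `tid` maps each fact (here, an edge as a pair of nodes) to its probability.
    `query` is a Boolean function on a possible world (a set of facts).
    Returns the exact probability that a random world satisfies the query.
    """
    facts = list(tid)
    total = Fraction(0)
    # Enumerate all 2^n possible worlds: each fact is kept or dropped independently.
    for keep in product([False, True], repeat=len(facts)):
        weight = Fraction(1)
        world = set()
        for fact, kept in zip(facts, keep):
            p = Fraction(tid[fact])
            weight *= p if kept else 1 - p
            if kept:
                world.add(fact)
        if query(world):
            total += weight
    return total


def st_connected(source, target):
    """Hypothetical example query: are source and target connected
    when the edges of the world are read as undirected?"""
    def query(world):
        seen, frontier = {source}, [source]
        while frontier:
            u = frontier.pop()
            for a, b in world:
                for x, y in ((a, b), (b, a)):
                    if x == u and y not in seen:
                        seen.add(y)
                        frontier.append(y)
        return target in seen
    return query


# Assumed toy instance: a triangle on s, a, t with every edge at probability 1/2.
half = Fraction(1, 2)
tid = {("s", "a"): half, ("a", "t"): half, ("s", "t"): half}
reliability = pqe(tid, st_connected("s", "t"))
print(reliability)
```

In this toy instance, s and t are disconnected only when the direct edge is absent and the two-edge path is broken, so the result can be checked by hand as 1 - (1/2)(3/4).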

Related Work
Preliminaries
Result
Hardness with Non-Iterable Edges
Finding a Minimal Tight Pattern
Hardness with Tight Iterable Edges
Generalizations of the Dichotomy Result
Findings
Conclusions
