Infinite Probabilistic Databases

Martin Grohe, Peter Lindner

doi:10.46298/lmcs-18(1:34)2022

Abstract

Probabilistic databases (PDBs) model uncertainty in data in a quantitative way. In the established formal framework, probabilistic (relational) databases are finite probability spaces over relational database instances. This finiteness can clash with intuitive query behavior (Ceylan et al., KR 2016), and with application scenarios that are better modeled by continuous probability distributions (Dalvi et al., CACM 2009). We formally introduced infinite PDBs in (Grohe and Lindner, PODS 2019) with a primary focus on countably infinite spaces. However, an extension beyond countable probability spaces raises nontrivial foundational issues concerned with the measurability of events and queries and ultimately with the question whether queries have a well-defined semantics. We argue that finite point processes are an appropriate model from probability theory for dealing with general probabilistic databases. This allows us to construct suitable (uncountable) probability spaces of database instances in a systematic way. Our main technical results are measurability statements for relational algebra queries as well as aggregate queries and Datalog queries.

Highlights

Probabilistic databases (PDBs) are used to model uncertainty in data
In [GL19], we introduced an extended model of PDBs as arbitrary probability spaces over finite database instances
We have introduced views as functions mapping database instances to database instances and adopted a semantics based on possible worlds
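As an illustration of the possible-worlds semantics mentioned above, the following is a minimal sketch for the finite case only. The representation of instances as frozensets of facts, the relation names, and the helpers `make_instance`, `warm_rooms`, and `apply_view` are all assumptions made for this example and are not taken from the paper. In the finite setting every event is trivially measurable; the measurability questions the paper addresses arise only for uncountable spaces, which this sketch does not model.

```python
from collections import defaultdict
from fractions import Fraction


def make_instance(*facts):
    """A database instance (possible world) as a frozenset of facts.

    Each fact is a tuple (relation_name, room, value). This encoding is
    an assumption of the sketch, not the paper's formalism.
    """
    return frozenset(facts)


# A finite PDB: a mapping from instances to probabilities summing to 1.
pdb = {
    make_instance(("Temp", "office", 20)): Fraction(1, 2),
    make_instance(("Temp", "office", 21)): Fraction(1, 4),
    make_instance(("Temp", "office", 21), ("Temp", "lab", 18)): Fraction(1, 4),
}


def warm_rooms(instance):
    """Hypothetical view: rooms with a recorded temperature above 20."""
    return frozenset(
        ("Warm", room)
        for (rel, room, temp) in instance
        if rel == "Temp" and temp > 20
    )


def apply_view(pdb, view):
    """Apply a view to every possible world and push the probabilities forward.

    Worlds that map to the same output instance have their probabilities added.
    """
    image = defaultdict(Fraction)
    for instance, prob in pdb.items():
        image[view(instance)] += prob
    return dict(image)


image = apply_view(pdb, warm_rooms)
for instance, prob in image.items():
    print(sorted(instance), prob)
```

The output is again a finite PDB: the view is evaluated world by world, and the probability of each output instance is the total probability of the input worlds that map to it.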

Summary

Introduction

Probabilistic databases (PDBs) are used to model uncertainty in data. Such uncertainty can have various causes, for example noisy sensor data, the presence of incomplete or inconsistent information, or information gathered from unreliable sources [Agg09, SORK11]. In the standard formal framework, probabilistic databases are finite probability spaces whose sample spaces consist of database instances in the usual sense, referred to as “possible worlds”. This framework has various shortcomings due to its inherent closed-world assumption [CDVdB16, CDVdB21] and its restriction to finite domains. Statistical models of uncertain data, for example for temperature measurements as in Example 2.1, usually use continuous probability distributions in appropriate error models. This continuous attribute-level uncertainty is not expressible in the traditional PDB model. In particular with respect to an open-world assumption, we would like

Journal: Logical Methods in Computer Science
Publication Date: Feb 25, 2022
Citations: 3
License type: CC BY 4.0
