Incrementally updating unary inclusion dependencies in dynamic data

Nuhad Shaabani, Christoph Meinel

doi:10.1007/s10619-018-7233-5

Abstract

Inclusion dependencies form one of the most fundamental classes of integrity constraints. Their importance in classical data management is reinforced by modern applications such as data profiling, data cleaning, entity resolution, and schema matching. Their discovery in an unknown dataset is at the core of any data-analysis effort. Therefore, several research approaches have focused on their efficient discovery in a given, static dataset. However, none of these approaches is suitable for dynamic datasets. In such cases, discovery techniques should be able to efficiently update the inclusion dependencies after the dataset changes, without reprocessing the entire dataset. We present the first approach for incrementally updating unary inclusion dependencies. Our approach is based on the concept of attribute clustering, from which the unary inclusion dependencies can be derived efficiently. We incrementally update the clusters after each update of the dataset. Updating the clusters does not require access to the dataset, because special data structures are designed to support the updating process efficiently. We evaluated our approach extensively by applying it to large datasets with several hundred attributes and more than 116.2 million tuples. The results show that incremental discovery significantly reduces the runtime compared with static discovery. The reduction in runtime is up to 99.9996% for both insertions and deletions.
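
To make the clustering idea concrete, the following is a minimal Python sketch. It is not the authors' algorithm or their data structures. It assumes one simple reading of attribute clustering: each distinct value is mapped to the set of attributes in which it occurs, and an attribute A is included in B exactly when every cluster containing A also contains B. The class name `UnaryINDIndex`, its methods, and the per-value occurrence counters are hypothetical choices made for illustration only.

```python
from collections import defaultdict


class UnaryINDIndex:
    """Illustrative index (hypothetical, not the paper's structure)."""

    def __init__(self, attributes):
        self.attributes = set(attributes)
        # value -> {attribute: number of occurrences of that value}
        self.counts = defaultdict(dict)

    def insert(self, row):
        """Register an inserted tuple given as {attribute: value}."""
        for attr, value in row.items():
            per_attr = self.counts[value]
            per_attr[attr] = per_attr.get(attr, 0) + 1

    def delete(self, row):
        """Unregister a tuple; assumes it was inserted earlier."""
        for attr, value in row.items():
            per_attr = self.counts.get(value)
            if not per_attr or attr not in per_attr:
                raise KeyError(f"{attr}={value!r} was never inserted")
            per_attr[attr] -= 1
            if per_attr[attr] == 0:
                del per_attr[attr]      # value left this attribute
            if not per_attr:
                del self.counts[value]  # value left the dataset

    def inds(self):
        """Derive unary INDs (A, B), meaning A is included in B.

        Only the clusters are read, never the tuples themselves.
        Attributes without any value are skipped in this sketch.
        """
        referenced = {}
        for per_attr in self.counts.values():
            cluster = set(per_attr)
            for attr in cluster:
                if attr in referenced:
                    referenced[attr] &= cluster
                else:
                    referenced[attr] = set(cluster)
        return {(a, b) for a, refs in referenced.items() for b in refs if b != a}


index = UnaryINDIndex(["A", "B"])
for row in ({"A": 1, "B": 1}, {"A": 2, "B": 2}, {"A": 2, "B": 3}):
    index.insert(row)
print(index.inds())  # A is included in B, but not the reverse
```

In this sketch the counters let a deletion decide whether a value has really left an attribute without rescanning the data, which mirrors the abstract's point that cluster updates need no access to the dataset. The derivation step here recomputes the dependencies from all clusters on each call; the paper's approach maintains them incrementally.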

Journal: Distributed and Parallel Databases
Publication Date: Aug 1, 2018
Citations: 5

Similar Papers

Incremental Discovery of Inclusion Dependencies
Nuhad Shaabani ... Christoph Meinel
27 Jun 2017

Inclusion Dependencies Reloaded
Henning Köhler ... Sebastian Link
17 Oct 2015

Detecting unique column combinations on dynamic data
Ziawasch Abedjan ... Felix Naumann
01 Mar 2014

The effect of unary inclusion dependencies on relational database design
Yanchun Zhang ... Maria E Orlowska
Computers and Mathematics with Applications | VOL. 24
01 Aug 1992
