An Approach for Optimizing Relational Provenance Storage

Li-Wei Wang, Zhi-Feng Bao, Koehler Henning, Xiao-Fang Zhou, Sadiq Shazia

doi:10.3724/sp.j.1016.2011.01863

Abstract

Modern data management must handle data from many sources of varying quality. Supporting data provenance at the system level, so that users can see where data comes from and how it was derived, has therefore become a critical research topic. Annotation is one approach to tracking provenance. However, storing fine-grained annotations can be expensive, because the complete annotations for a data set may require more storage space than the data itself. In this paper, we propose a framework for storing provenance information for data derived via relational queries. The framework uses provenance trees that match the query structure, which avoids redundant storage of information about the derivation process. Within this framework, we develop a series of storage optimization methods for relational queries that choose the query tree nodes at which provenance information should be stored. Our optimization algorithms run in time polynomial in the query size and linear in the size of the provenance, enabling provenance tracking and optimization without incurring large overheads. The framework offers a new direction for data tracing research and has a wide range of applications.

Journal: Chinese Journal of Computers
Publication Date: Oct 28, 2011
Citations: 3

Similar Papers

Efficient provenance storage for relational queries
Zhifeng Bao ... Henning Köhler
29 Oct 2012

Approaches and tools for user-driven provenance and data quality information in spatial data infrastructures
Julia Fischer ... Ralf Seppelt
International Journal of Digital Earth | VOL. 16
24 Apr 2023

Storage and Use of Provenance Information for Relational Database Queries
Zhifeng Bao ... Tok Wang Ling
01 Jan 2010

Secure Yannakakis
Yilei Wang ... Ke Yi
09 Jun 2021
