Abstract

A database stores data in order to provide the user with information. However, how a database achieves this is not always clear. The main reason seems to be that we in the database community have not fully understood, and therefore have not clearly defined, the notion of “the information that data in a database carry”, in other words, “the information content of data”. As a result, the capability of databases to answer queries is limited: when users seek information beyond the scope of the data stored in a database, the database normally cannot provide it. The underlying reason is that queries are answered by a direct match between a query and data (up to aggregations of the data), which in turn is because the information that data carry is taken to be exactly the data per se. To tackle this problem, we propose the notion of the information content inclusion relation (IIR), show that it formalizes the intuitive notion of the “information content of data”, and show how it may be used to derive information from data in a database.

Highlights

  • When we query a database, it is said that we are retrieving information from it

  • We propose the notion of the information content inclusion relation (IIR), show that it formalizes the intuitive notion of the “information content of data”, and show how it may be used to derive information from data in a database

  • The IIR in the set F are called the original information content inclusion relations (IIR): they are identified by applying the definition of IIR directly to sources such as the real world, database systems and domain knowledge, rather than derived by applying the inference rules to known IIR


Summary

Introduction

When we query a database, it is said that we are retrieving information from it. This is taken for granted. However, the notion of “information content of data” appears to be elusive. It has been taken to be the instance of a database, and the information capacity of a data schema has been taken to be the collection of instances of the schema [9,10,11]. We tackle the following research question in this paper: how the “information content” of data in a database may be defined with mathematical rigor, and how this notion, once defined, may help retrieve information through reasoning that would not be possible through conventional queries. To answer this research question, we propose to look at the relationships between the information content of data, database structure and domain knowledge, which may be captured as business rules.
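The idea of retrieving information through reasoning rather than direct matching can be illustrated with a minimal sketch. It is not the paper's algorithm: it assumes that an IIR can be represented as a pair (x, y), read as "the information content of x includes y", and that transitivity is one of the applicable inference rules. The function names `iir_closure` and `carries` and the example statements are hypothetical.

```python
from collections import defaultdict


def iir_closure(relations):
    """Return every pair (x, y) derivable from `relations` by transitivity.

    Assumption: each pair (x, y) reads "the information content of x
    includes y", and transitivity is the only inference rule applied.
    """
    closure = set(relations)
    changed = True
    while changed:
        changed = False
        # Index the current pairs by their left-hand side.
        successors = defaultdict(set)
        for x, y in closure:
            successors[x].add(y)
        # Chain (x, y) with (y, z) to derive (x, z).
        for x, y in list(closure):
            for z in successors.get(y, ()):
                if (x, z) not in closure:
                    closure.add((x, z))
                    changed = True
    return closure


def carries(relations, datum, info):
    """True if `info` is the datum itself or is derivable from it."""
    return datum == info or (datum, info) in iir_closure(relations)


# Hypothetical original IIR, e.g. one from the schema and one business rule.
original_iir = {
    ("employee works in department D1",
     "employee is managed by the manager of D1"),
    ("employee is managed by the manager of D1",
     "employee reports to someone"),
}
```

With these two relations, `carries(original_iir, "employee works in department D1", "employee reports to someone")` holds even though no stored pair matches the question directly, which is the kind of derivation a direct-match query would miss.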

Information Content
Random Variables
Random Events
Particulars of Random Events
Information Content of a State of Affairs
Types and Sources of IIR
Rules for Inferences on and of IIR
The Closure of a Set of IIRs
IIR Closure of a Random Event
A System for Querying a Database with
Conventions
IIR Inference Rules
An Example of IIR Closures
What We Learnt from This Example
An Algorithm for Computing IIR Closures
An Example of Querying a Database Using
Contributions of This Work
Conclusions
