Abstract

We have previously proposed an extended relational data model that supports uncertain information in a consistent and coherent manner. The model, which can represent both uncertainty and imprecision in data, is based on the Dempster-Shafer (D-S) theory of evidence and uses the theory's bel and pls functions, with their definitions extended somewhat for this purpose. Relational operations such as Select, Cartesian Product, Join, Project, Intersect, and Union have previously been defined [21]. In this paper we consider two data combination problems associated with the data model, which are believed to be inherent in most database models that handle uncertain information: the potential existence in the database of identical tuples with different degrees of belief (the redundancy problem), and the potential existence of different tuples with the same key values (the inconsistency problem). The redundancy problem was treated to some extent in an earlier paper, but the inconsistency problem has not yet been considered. The well-known orthogonal sum operation of the D-S theory, which pools data in order to choose between hypotheses, may be viewed as a means of reducing inconsistency in data arising from different sources. This capability has not yet been exploited in our data model. We therefore define a new combine operation as a primitive for handling inconsistency in relations. When data from a number of sources is pooled, often to support decision making, the Union and Project operations are very important. We are particularly interested in the case where tuples in operand relations match attribute-wise but have different uncertainty and imprecision characteristics.
Executing the Union and Project operations, which the new combine operation helps to carry out, is a means of dealing with the problem of information aggregation. We use the orthogonal sum, which generalizes results from traditional probability theory in a natural and correct manner, to pool evidence during the combine computation. The paper also addresses the execution efficiency of the suggested approach. The orthogonal sum operation has exponential complexity if implemented naively. A linear-time algorithm is readily available for Union and Project in the simple case where the attribute values to be combined are singletons (i.e., atomic values, as in the conventional relational model). However, many potential applications of the approach can exploit the new data model's support for set-valued attributes. The method presented here combines data supporting non-singleton subsets in linear time.
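
For readers unfamiliar with the orthogonal sum, the following is a minimal sketch of the standard Dempster rule of combination, together with the textbook bel and pls functions. It is not the combine operation or the linear-time method of the paper, and it does not use the extended definitions of bel and pls mentioned above. The representation (a dict from `frozenset` focal elements to masses) and the names `orthogonal_sum`, `bel`, and `pls` are assumptions made for illustration only.

```python
from itertools import product


def orthogonal_sum(m1, m2, eps=1e-12):
    """Combine two mass functions with Dempster's rule (naive pairwise form).

    Each mass function is a dict mapping a frozenset (focal element)
    to its mass; the masses of one function are assumed to sum to 1.
    """
    combined = {}
    conflict = 0.0
    for (a, wa), (b, wb) in product(m1.items(), m2.items()):
        common = a & b
        if common:
            # Agreeing evidence: the product mass goes to the intersection.
            combined[common] = combined.get(common, 0.0) + wa * wb
        else:
            # Disjoint focal elements: the product mass counts as conflict.
            conflict += wa * wb
    norm = 1.0 - conflict
    if norm <= eps:
        raise ValueError("sources are totally conflicting; sum is undefined")
    # Renormalize so that the combined masses sum to 1 again.
    return {s: w / norm for s, w in combined.items()}


def bel(m, a):
    """Belief in a: total mass of focal elements contained in a."""
    return sum(w for s, w in m.items() if s <= a)


def pls(m, a):
    """Plausibility of a: total mass of focal elements intersecting a."""
    return sum(w for s, w in m.items() if s & a)


if __name__ == "__main__":
    # Hypothetical attribute domain and two hypothetical sources.
    frame = frozenset({"red", "green", "blue"})
    source1 = {frozenset({"red"}): 0.6, frame: 0.4}
    source2 = {frozenset({"red", "green"}): 0.7, frame: 0.3}
    pooled = orthogonal_sum(source1, source2)
    for focal, mass in pooled.items():
        print(sorted(focal), round(mass, 4))
    print("bel(red) =", round(bel(pooled, frozenset({"red"})), 4))
    print("pls(red) =", round(pls(pooled, frozenset({"red"})), 4))
```

This pairwise form visits every pair of focal elements, so its cost grows with the product of the two focal-element counts. It is meant only to show what pooling evidence means for set-valued attribute values, not to reflect the efficiency results claimed in the abstract.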
