Set-valued Attributes Research Articles

We have previously proposed an extended relational data model that supports uncertain information in a consistent and coherent manner. The model, which can represent both uncertainty and imprecision in data, is based on the Dempster-Shafer (D-S) theory of evidence and uses the theory's bel and pls functions, with their definitions extended somewhat for this purpose. Relational operations such as Select, Cartesian Product, Join, Project, Intersect, and Union have previously been defined [21].

In this paper we consider two data combination problems associated with the data model. These problems are believed to be inherent in most database models that handle uncertain information. They are the potential existence in the database of identical tuples with different degrees of belief (the redundancy problem), and the potential existence of different tuples with the same key values (the inconsistency problem). The redundancy problem was treated to some extent in an earlier paper, but the inconsistency problem has not yet been considered.

The well-known orthogonal sum operation of the D-S theory, which pools data for the purpose of choosing between hypotheses, may be viewed as a means of reducing inconsistency in data arising from different sources. This capability has not yet been exploited in our data model, so the idea here is to define a new combine operation as a primitive for handling inconsistency in relations.

When data from a number of sources is being pooled, often in order to support decision making, the Union and Project operations are very important. We are particularly interested in the case where tuples in operand relations match attribute-wise but have different uncertainty and imprecision characteristics. Executing the Union and Project operations, which the new combine operation supports, is a means of dealing with the problem of information aggregation. We use the orthogonal sum, which generalizes results from traditional probability theory in a natural and correct manner, for pooling evidence during the combine computation.

The paper also addresses the execution efficiency of our suggested approach. The orthogonal sum operation has exponential complexity if implemented naively. A linear-time algorithm can readily be made available for Union and Project in the simple case where the attribute values to be combined are singletons (i.e., atomic values, as in the conventional relational model). However, many potential applications of the approach can exploit the new data model's support for set-valued attributes. With the method presented here, data supporting non-singleton subsets can also be combined in linear time.
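To make the orthogonal sum concrete, here is a minimal Python sketch of the textbook Dempster rule of combination, together with the standard bel and pls functions. It is not the paper's combine operation or its linear-time method, and it does not reproduce the extended definitions of bel and pls that the abstract mentions. The names `orthogonal_sum`, `bel`, `pls`, and the colour frame are assumptions chosen for illustration; a mass function is assumed to be a dict mapping `frozenset` focal elements to masses that sum to 1.

```python
from itertools import product


def orthogonal_sum(m1, m2):
    """Textbook Dempster combination of two mass functions.

    Each mass function is a dict {frozenset: mass} (an assumed representation).
    Every pair of focal elements is visited, so this sketch is quadratic in
    the number of focal elements; it is not the linear-time method of the paper.
    """
    pooled = {}
    conflict = 0.0
    for (a, mass_a), (b, mass_b) in product(m1.items(), m2.items()):
        common = a & b
        if common:
            # Agreeing evidence: the product mass goes to the intersection.
            pooled[common] = pooled.get(common, 0.0) + mass_a * mass_b
        else:
            # Disjoint focal elements contribute to the conflict mass.
            conflict += mass_a * mass_b
    norm = 1.0 - conflict
    if norm <= 0.0:
        raise ValueError("sources are totally conflicting; sum is undefined")
    # Renormalise so that the pooled masses sum to 1 again.
    return {s: v / norm for s, v in pooled.items()}


def bel(m, hypothesis):
    """Standard belief: total mass of focal elements contained in the hypothesis."""
    return sum(v for s, v in m.items() if s <= hypothesis)


def pls(m, hypothesis):
    """Standard plausibility: total mass of focal elements intersecting the hypothesis."""
    return sum(v for s, v in m.items() if s & hypothesis)


# Hypothetical frame of discernment and two sources of evidence.
FRAME = frozenset({"red", "green", "blue"})
source_1 = {frozenset({"red"}): 0.6, FRAME: 0.4}
source_2 = {frozenset({"green"}): 0.5, FRAME: 0.5}

pooled = orthogonal_sum(source_1, source_2)
for focal, mass in sorted(pooled.items(), key=lambda kv: sorted(kv[0])):
    print(sorted(focal), round(mass, 4))
print("bel(red) =", round(bel(pooled, frozenset({"red"})), 4))
print("pls(red) =", round(pls(pooled, frozenset({"red"})), 4))
```

In this example the two sources disagree on 0.3 of the product mass (red against green), which is discarded, and the remaining mass is renormalised by 0.7. Mass left on the whole frame represents ignorance, which is why pls for a hypothesis can exceed bel.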


Articles published on Set-valued Attributes

A Study on Improving Metrics of Information Loss in Anonymization of Semi-structured Transaction Data with Set-valued Attributes

Selectivity Estimation for Queries Containing Predicates over Set-Valued Attributes

Internal and external memory set containment join

Designing efficient algorithms for querying large corpora

FreshJoin: An Efficient and Adaptive Algorithm for Set Containment Join

A graph-based multifold model for anonymizing data with attributes of multiple types

A fuzzy SV-k-modes algorithm for clustering categorical data with set-valued attributes

Efficient parallel boolean matrix based algorithms for computing composite rough set approximations

A Locality Sensitive Hashing Technique for Categorical Data

On the Signature Tree Construction and Analysis

A performance study of four index structures for set-valued attributes of low cardinality

Adaptive algorithms for set containment joins

User profiling for the melvil knowledge retrieval system

Signature-based structures for objects with set-valued attributes

Induction of decision trees in numeric domains using set-valued attributes

A cost model for sort-domain traversal strategy in object-oriented databases

Generalized union and project operations for pooling uncertain and imprecise information

Fast algorithms for universal quantification in large databases

Optimal block size for set-valued attributes

Improved reliability predictions for commercial computers: N. Keith Hergatt. Proc. A. Reliab. Maintainab. Symp., 357 (1991)
