Probabilistic XML Data Research Articles

Recently, there has been growing interest in data models and query processing for probabilistic XML data. Probabilistic data has many potential applications, and the XML data model naturally represents hierarchical information and data uncertainty at different levels. However, previously proposed probabilistic XML data models and query processing techniques separate finding data matches from evaluating the probabilities of results. As a result, they must access the data repeatedly and retrieve the full data along the paths given in queries to calculate the probabilities of results. In this paper, we propose an extended interval-based labeling scheme for the probabilistic XML data tree and an efficient query processing procedure that uses the labeling scheme. Unlike previous work, our method accesses only the labels of the data specified in queries and finds data matches while simultaneously evaluating the probability of each match. We also present an extended probabilistic XML query model with predicates on probability values, and a lightweight index on those probabilities that eliminates unnecessary access to data that will not be included in the results. Experimental results show that our approach is efficient for probabilistic XML query processing and that our index scheme significantly improves query performance when predicates on probability values are given.
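To make the idea of label-only query processing concrete, here is a minimal sketch in Python. It is not the labeling scheme of the paper, whose details are not given in the abstract. It assumes that every node carries a conditional probability of existing given its parent, that these probabilities are independent, and that each interval label is extended with the product of the probabilities along the root-to-node path. The names `Node`, `Label`, `label_tree`, and `match_descendants` are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Node:
    tag: str
    prob: float = 1.0  # assumed: P(node exists | parent exists), independent per node
    children: List["Node"] = field(default_factory=list)


@dataclass(frozen=True)
class Label:
    tag: str
    start: int        # counter value when the node is entered
    end: int          # counter value when the node is left
    level: int        # depth in the tree, root = 0
    path_prob: float  # product of prob values from the root down to this node


def label_tree(root: Node) -> List[Label]:
    """Assign interval labels in one depth-first pass; returns labels in document order."""
    labels: List[Label] = []
    counter = 0

    def visit(node: Node, level: int, parent_prob: float) -> None:
        nonlocal counter
        counter += 1
        start = counter
        path_prob = parent_prob * node.prob
        for child in node.children:
            visit(child, level + 1, path_prob)
        counter += 1
        labels.append(Label(node.tag, start, counter, level, path_prob))

    visit(root, 0, 1.0)
    return sorted(labels, key=lambda lab: lab.start)


def is_ancestor(a: Label, d: Label) -> bool:
    """Interval containment: a is a proper ancestor of d."""
    return a.start < d.start and d.end < a.end


def match_descendants(
    labels: List[Label], anc_tag: str, desc_tag: str, min_prob: float = 0.0
) -> List[Tuple[Label, Label, float]]:
    """Evaluate anc_tag//desc_tag using labels only, with an optional probability predicate."""
    ancestors = [lab for lab in labels if lab.tag == anc_tag]
    results = []
    for d in labels:
        # The probability predicate prunes candidates before any structural check.
        if d.tag != desc_tag or d.path_prob < min_prob:
            continue
        for a in ancestors:
            if is_ancestor(a, d):
                # Under the independence assumption, d exists only if all of its
                # ancestors exist, so the match probability is d.path_prob.
                results.append((a, d, d.path_prob))
    return results


# Hypothetical example document.
root = Node("library", 1.0, [
    Node("book", 0.9, [Node("title", 0.8), Node("author", 0.5)]),
    Node("book", 0.4, [Node("title", 1.0)]),
])
labels = label_tree(root)
for anc, desc, p in match_descendants(labels, "book", "title", min_prob=0.5):
    print(anc.start, desc.start, round(p, 3))
```

Because the path probability is stored in the label, both the structural test and the probability predicate are answered without touching the document itself. A real scheme would also have to handle correlated choices such as mutually exclusive siblings, which this sketch ignores.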

We survey recent results on modeling and querying probabilistic XML data. The literature contains a plethora of probabilistic XML models [2, 13, 14, 18, 21, 24, 27], and most of them can be represented by means of p-documents [18] that have, in addition to ordinary nodes, distributional nodes that specify the probabilistic process of generating a random document. The above models are families of p-documents that differ in the types of distributional nodes in use. The focus of this survey is on the tradeoff between the ability to express real-world probabilistic data (in particular, by taking correlations between atomic events into account) and the efficiency of query evaluation. We concentrate on two important issues. The first is the ability to efficiently translate a p-document of one family into that of another. The second is the complexity of query evaluation over p-documents (under the usual semantics of querying probabilistic data, e.g., [4, 9, 10]). It turns out that efficient evaluation of a large class of queries (i.e., twig patterns with projection and aggregate functions) is realizable in models where distributional nodes are probabilistically independent. In other models, the evaluation of a query with projection is very often intractable. In comparison, very simple conjunctive queries are intractable over probabilistic models of relational databases, even when the tuples are probabilistically independent [9, 10]. To handle the limitation exhibited by the above tradeoff, various approaches have been proposed. The first is to allow query answers to be approximate [18], which makes the evaluation of twig patterns with projection tractable in the most expressive family of p-documents, among those considered. This tractability, however, does not carry over to nonmonotonic queries, such as twig patterns with negation or aggregation. The approach presented in [7]
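The notion of a p-document as a process that generates a random document can be illustrated with a small brute-force sketch. It assumes two kinds of distributional nodes, independent choice (`Ind`) and mutually exclusive choice (`Mux`), and it enumerates every possible world, which takes exponential time. It is meant only to show the semantics, not the efficient evaluation algorithms the survey discusses. The class and function names are hypothetical.

```python
from dataclasses import dataclass
from itertools import product
from typing import List, Tuple


@dataclass
class Ord:
    """Ordinary node: always kept if it is reached."""
    tag: str
    children: list


@dataclass
class Ind:
    """Distributional node: each child is kept independently with its probability."""
    choices: List[Tuple[float, object]]


@dataclass
class Mux:
    """Distributional node: at most one child is kept; probabilities sum to at most 1."""
    choices: List[Tuple[float, object]]


def combine(option_lists):
    """Cartesian product of independent options: multiply probabilities, concatenate forests."""
    results = []
    for combo in product(*option_lists):
        prob, forest = 1.0, ()
        for q, f in combo:
            prob *= q
            forest += f
        results.append((prob, forest))
    return results


def worlds(node):
    """Return (probability, forest) pairs; a forest is a tuple of (tag, children) trees."""
    if isinstance(node, Ord):
        child_worlds = combine([worlds(c) for c in node.children])
        return [(p, ((node.tag, forest),)) for p, forest in child_worlds]
    if isinstance(node, Ind):
        per_child = []
        for p, child in node.choices:
            kept = [(p * q, f) for q, f in worlds(child)]
            per_child.append([(1.0 - p, ())] + kept)
        return combine(per_child)
    if isinstance(node, Mux):
        # Remaining probability mass means that no child is chosen.
        options = [(1.0 - sum(p for p, _ in node.choices), ())]
        for p, child in node.choices:
            options += [(p * q, f) for q, f in worlds(child)]
        return options
    raise TypeError(f"unknown node type: {type(node).__name__}")


def has_tag(tree, tag):
    """True if the tree (tag, children) contains a node with the given tag."""
    t, kids = tree
    return t == tag or any(has_tag(k, tag) for k in kids)


def query_probability(root, tag):
    """Probability that a random document contains at least one node with the given tag."""
    return sum(p for p, forest in worlds(root)
               if any(has_tag(t, tag) for t in forest))


# Hypothetical p-document: optional name and phone, and at most one city.
doc = Ord("person", [
    Ind([(0.7, Ord("name", [])), (0.4, Ord("phone", []))]),
    Mux([(0.6, Ord("paris", [])), (0.3, Ord("lyon", []))]),
])
print(len(worlds(doc)), round(query_probability(doc, "phone"), 3))
```

Distributional nodes never appear in a generated document: the chosen children are attached directly to the nearest ordinary ancestor, which is why `worlds` returns forests rather than single trees for `Ind` and `Mux`.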

Articles published on Probabilistic XML Data

Keyword Search over Probabilistic XML Documents Based on Node Classification

Research on Basic Operations for Query Probabilistic XML Document Based on Path Set

ELCA evaluation for keyword search on probabilistic XML data

Efficient probabilistic XML query processing using an extended labeling scheme and a lightweight index

Efficient processing of top-k twig queries over probabilistic XML data

Research on Querying Node Probability Method in Probabilistic XML Data Based on Possible World

Modeling and querying probabilistic XML data

Probabilistic interval XML
