Abstract
In recent years, with the emergence of a number of new real applications, such as protein-protein interaction (PPI) networks, visual pattern recognition, and intelligent traffic systems, managing huge volumes of uncertain graphs has attracted much attention in the database community. Currently, most existing fundamental queries over graphs only support deterministic (or certain) graphs, although real graph data are often noisy, inaccurate, and incomplete. In this paper, we study a new type of uncertain graph query, probabilistic supergraph containment query over large uncertain graphs. Specifically, given an uncertain graph database UGD which contains a set of uncertain graphs, a deterministic query graph q, and a probabilistic threshold δ, a probabilistic supergraph containment query is to find the set of uncertain graphs from UGD, denoted as UGDq, such that UGDq={ugi∈ UGD|Pr(ugi⊆ q)≥δ} where Pr(ugi⊆q) means the likelihood that ugi is a subgraph of q. We prove that the computation of Pr(ugi⊆q) is #P-hard and design an efficient filtering-and-verification framework to avoid the expensive computation. In particular, we propose an effective filtering strategy and a novel probabilistic inverted index, called PS-Index, to enhance pruning power in the filtering phase. Furthermore, the candidate graphs which pass the filtering phase are tested in the verification phase via an efficient unequal probability sampling-based approximation algorithm. Finally, we verify the effectiveness and efficiency of the proposed methods through extensive experiments.
Talk to us
Join us for a 30 min session where you can share your feedback and ask us any queries you have
Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.