Abstract

Composing queries is evidently a tedious task. This is particularly true of graph queries, which are typically complex and prone to errors, a problem compounded by the fact that graph schemas can be missing or too loose to be helpful for query formulation. Despite the great success of query formulation aids, in particular automatic query completion, graph query autocompletion has received much less research attention. In this paper, we propose a novel framework for subgraph query autocompletion (called AutoG). Given an initial query q and a user’s preference as input, AutoG returns ranked query suggestions $$Q'$$ as output. Users may choose a query from $$Q'$$ and iteratively apply AutoG to compose their queries. The novelties of AutoG are as follows: First, we formalize query composition. Second, we propose to increment a query with logical units called c-prime features, which are (i) frequent subgraphs and (ii) constructed from smaller c-prime features in no more than c ways. Third, we propose algorithms to rank candidate suggestions. Fourth, we propose a novel index called feature DAG (FDAG) to optimize the ranking. We study the query suggestion quality with simulations and real users and conduct an extensive performance evaluation. The results show that the query suggestions are useful (they saved roughly 40% of users’ mouse clicks), and that AutoG returns suggestions quickly under a large variety of parameter settings.
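The iterative use of AutoG described in the abstract can be pictured with a small sketch. This is only an illustration under strong assumptions, not the paper's method: a query is encoded as a set of edges between named nodes, "adding a feature" is a plain set union (the paper's query composition is more involved), and `suggest`, `compose`, `score` and `choose` are hypothetical names introduced here.

```python
from typing import Callable, FrozenSet, List, Tuple

Edge = Tuple[str, str]      # toy encoding: an edge between two named nodes
Query = FrozenSet[Edge]     # a query (or a feature) is a set of edges


def suggest(q: Query, features: List[Query],
            score: Callable[[Query], float], k: int) -> List[Query]:
    """Return up to k candidate queries, each extending q by one feature.

    Assumption: set union stands in for the paper's query composition.
    """
    # Skip features that are already fully contained in q.
    candidates = {q | f for f in features if not f <= q}
    # Highest score first; ties broken deterministically by edge list.
    return sorted(candidates, key=lambda c: (-score(c), sorted(c)))[:k]


def compose(q: Query, features: List[Query],
            score: Callable[[Query], float], k: int,
            choose: Callable[[List[Query]], Query], rounds: int) -> Query:
    """Repeatedly suggest candidates and let the user pick one."""
    for _ in range(rounds):
        suggestions = suggest(q, features, score, k)
        if not suggestions:     # nothing left to add
            break
        q = choose(suggestions)
    return q


# Toy usage: three features, a score that simply prefers larger queries,
# and a "user" who always accepts the top-ranked suggestion.
features = [
    frozenset({("C1", "C2")}),
    frozenset({("C2", "O1")}),
    frozenset({("C2", "N1"), ("N1", "C3")}),
]
start = frozenset({("C1", "C2")})
final = compose(start, features, score=len, k=2,
                choose=lambda s: s[0], rounds=3)
```

In this toy run the loop stops once every feature is contained in the query. A real ranking function would reflect the user's preference rather than query size.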

Highlights

  • The prevalence of graph-structured data in modern real-world applications, such as biological and chemical databases (e.g., PubChem) and co-purchase networks (e.g., Amazon.com), has led to a rejuvenation of research on graph data management and analytics

  • To optimize the ranked subgraph query suggestion (RSQ) problem, we propose a novel index for c-prime features, called feature DAG (FDAG)

  • The time complexity of Algo. 1 is $$O(|F_q| \times T_{subiso} + |E|^2 \times |M_{F_q}|)$$, where $$F_q$$ denotes the features of the query q, (a) the first term is the time for determining the embeddings of $$F_q$$ in q, with $$T_{subiso}$$ the time for one subgraph isomorphism call, and (b) the second term is for scanning the $$|M_{F_q}|$$ embeddings to cover $$O(|E|)$$ edges in the FIND function, which is invoked $$O(|E|)$$ times
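To see where the second term of that bound can come from, here is a minimal sketch of a generic greedy edge cover. It is not the paper's Algo. 1: the selection rule (pick the embedding that covers the most uncovered edges) and the names `find` and `greedy_cover` are assumptions made for illustration. Embeddings are given as sets of query edges, standing in for the embeddings of the features in q.

```python
from typing import FrozenSet, Iterable, List, Optional, Set, Tuple

Edge = Tuple[int, int]   # a query edge, identified by its endpoint ids


def find(uncovered: Set[Edge],
         embeddings: List[FrozenSet[Edge]]) -> Optional[FrozenSet[Edge]]:
    """Scan every embedding and return the one covering most uncovered edges."""
    best, best_gain = None, 0
    for m in embeddings:              # |M| embeddings scanned per call
        gain = len(m & uncovered)     # O(|E|) work per embedding
        if gain > best_gain:
            best, best_gain = m, gain
    return best                       # None if no embedding helps


def greedy_cover(edges: Iterable[Edge], embeddings: List[FrozenSet[Edge]]):
    """Cover the query edges with embeddings, one find() call per round."""
    uncovered = set(edges)
    chosen: List[FrozenSet[Edge]] = []
    while uncovered:                  # at most |E| rounds: each covers >= 1 edge
        m = find(uncovered, embeddings)
        if m is None:                 # remaining edges cannot be covered
            break
        chosen.append(m)
        uncovered -= m
    return chosen, uncovered


# Toy usage: a path query with four edges and four candidate embeddings.
path_edges = [(0, 1), (1, 2), (2, 3), (3, 4)]
path_embeddings = [
    frozenset({(0, 1), (1, 2)}),
    frozenset({(1, 2), (2, 3)}),
    frozenset({(2, 3), (3, 4)}),
    frozenset({(3, 4)}),
]
chosen, leftover = greedy_cover(path_edges, path_embeddings)
```

Each `find` call costs about |M| × |E| set operations and is invoked at most |E| times, which gives the |E|² × |M| shape of the second term under these assumptions.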

Summary

Introduction

The prevalence of graph-structured data in modern real-world applications, such as biological and chemical databases (e.g., PubChem) and co-purchase networks (e.g., Amazon.com), has led to a rejuvenation of research on graph data management and analytics. Chemists are not usually expected to learn the complex syntax of a graph query language in order to formulate meaningful queries over a chemical compound database such as PubChem or eMolecule. There have been increasing efforts from academia [18] and industry (e.g., PubChem and eMolecule) to create user-friendly GUIs that ease the burden of query formulation. Given a partially constructed visual subgraph query, it is desirable to suggest the top-k possible query fragments that the user may add to their intermediate query in subsequent steps.

Subgraph queries and background
Preliminaries
Query composition
Query decomposition
1: Let Fq be the c-prime features of q
Complexity analysis of query decomposition
Ranking candidate suggestions
Ranking function and user preference component
Efficient selectivity and diversity computation
Greedy ranking algorithm
Complexity analysis of the greedy ranking algorithm
Autocompletion by using FDAG
Pruning redundant compositions via graph automorphism
Experimental Evaluation
Suggestion quality
Index construction performance
Online Autocomplete performance
Related Work
Conclusion
A Properties of c-prime features
Suggestion qualities with different underlying definitions in AUTOG
Online performance breakdowns
Findings
D The FDAG construction
