Space-Time Tradeoffs for Conjunctive Queries with Access Patterns

Abstract

In this article, we investigate space-time tradeoffs for answering conjunctive queries with access patterns (CQAPs). The goal is to create a space-efficient data structure in an initial preprocessing phase and use it to answer (multiple) queries in an online phase. Previous work has developed data structures that trade off space usage for answering time for queries of practical interest, such as the path and triangle queries. However, these approaches lack a comprehensive framework and are not generalizable. Our main contribution is a general algorithmic framework for obtaining space-time tradeoffs for any CQAP. Our framework builds upon the PANDA algorithm and tree decomposition techniques. We demonstrate that our framework captures all state-of-the-art tradeoffs that were independently produced for various queries. Furthermore, we show surprising improvements over the state-of-the-art tradeoffs known in the existing literature for reachability queries.
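To make the preprocessing/online split concrete, the following is a minimal sketch of a classic heavy/light tradeoff for a 2-path query with access pattern (x, z): given x and z, decide whether some y satisfies R(x, y) and R(y, z). It is an illustration only and not the PANDA-based framework of the article; the class name `TwoPathIndex` and the parameter `threshold` are hypothetical. With N edges and threshold T, at most N/T values are heavy on each side, so at most (N/T)^2 pairs are materialized on top of the linear-size adjacency sets, and every remaining query scans at most T neighbors.

```python
from collections import defaultdict


class TwoPathIndex:
    """Answers: given (x, z), is there a y with (x, y) in R and (y, z) in R?"""

    def __init__(self, edges, threshold):
        # Preprocessing phase: build adjacency sets (linear space).
        self.out = defaultdict(set)  # x -> {y : (x, y) in R}
        self.inc = defaultdict(set)  # z -> {y : (y, z) in R}
        for a, b in edges:
            self.out[a].add(b)
            self.inc[b].add(a)

        # A value is "heavy" if it has more than `threshold` neighbors.
        self.heavy_x = {x for x, ys in self.out.items() if len(ys) > threshold}
        self.heavy_z = {z for z, ys in self.inc.items() if len(ys) > threshold}

        # Materialize the positive answers for heavy-heavy pairs only.
        self.heavy_pairs = {
            (x, z)
            for x in self.heavy_x
            for z in self.heavy_z
            if not self.out[x].isdisjoint(self.inc[z])
        }

    def query(self, x, z):
        # Online phase.
        if x in self.heavy_x and z in self.heavy_z:
            # Both sides heavy: a single lookup in the stored answers.
            return (x, z) in self.heavy_pairs
        # Otherwise at least one side has at most `threshold` neighbors;
        # scan the smaller set and probe the larger one.
        xs = self.out.get(x, set())
        zs = self.inc.get(z, set())
        small, large = (xs, zs) if len(xs) <= len(zs) else (zs, xs)
        return any(y in large for y in small)


edges = [(1, 2), (2, 3), (1, 4), (4, 3), (5, 6), (6, 7)]
index = TwoPathIndex(edges, threshold=1)
print(index.query(1, 3))  # answered from the materialized heavy pairs
print(index.query(5, 7))  # answered by scanning a light neighbor set
```

Raising `threshold` shrinks the materialized set and lengthens the worst-case scan; lowering it does the opposite. This is the kind of space-for-time knob that the abstract refers to, here hard-wired to a single query shape.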

  • Research Article
  • 10.1145/3743130
Space-Time Tradeoffs for Conjunctive Queries with Access Patterns
  • Jul 26, 2025
  • ACM Transactions on Database Systems
  • Hangdong Zhao + 2 more
