Abstract

Indexing intervals is a fundamental problem, finding a wide range of applications. Recent work on managing large collections of intervals in main memory focused on overlap joins and temporal aggregation problems. In this paper, we propose novel and efficient in-memory indexing techniques for intervals, with a focus on interval range queries, which are a basic component of many search and analysis tasks. First, we propose an optimized version of a single-level (flat) domain-partitioning approach, which may have large space requirements due to excessive replication. Then, we propose a hierarchical partitioning approach, which assigns each interval to at most two partitions per level and has controlled space requirements. Novel elements of our techniques include the division of the intervals at each partition into groups based on whether they begin inside or before the partition boundaries, reducing the information stored at each partition to the absolutely necessary, and the effective handling of data sparsity and skew. Experimental results on real and synthetic interval sets of different characteristics show that our approaches are typically one order of magnitude faster than the state-of-the-art.

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.