Abstract

Binary relations are commonly used in computer science for modeling data. In addition to classical representations using matrices or lists, some compressed data structures have recently been proposed to represent binary relations in compact space, such as the $k^2$-tree and the Binary Relation Wavelet Tree (BRWT). Knowing their storage needs, supported operations and time performance is key to choosing an appropriate data representation for a given domain or application, its data distribution and the operations typically computed over the data. In this work, we present an empirical comparison among several compressed representations for binary relations. We analyze their space usage and the speed of their operations using different (synthetic and real) data distributions. We include both neighborhood and set operations, and we also propose algorithms for set operations over the BRWT, which had not been presented before in the literature. We conclude that no single representation clearly outperforms the rest, but we give recommendations on when to use each compact representation depending on the data distribution and the types of operations performed over the data. We also include a scalability study of the data representations.

Highlights

  • We present a comparison of three compact data structures that can be used to represent binary relations: $k^2$-tree, $k^2$-tree1 and BRWT (Binary Relation Wavelet Tree)

  • Set operations over BRWT: we describe the algorithms for computing union, intersection, difference, and symmetric difference of binary relations represented using BRWTs

Summary

INTRODUCTION

Let A and B be two sets of objects. A binary relation R is defined as a subset of the Cartesian product A × B, where for each element (a, b) ∈ R, we say that a is related to b and denote this as aRb. The comparison considers the same operations for all evaluated data structures: set operations (union, intersection, difference, and symmetric difference) and primitive neighborhood operations (isRelated, successors, predecessors, and range neighborhood queries). The goal of this comparison is to facilitate the choice of the most appropriate data structure for a particular application or domain. As an additional alternative to compact data structures, we include in our comparison the representation of binary relations using compressed adjacency lists (using QMX and Rice-runs encoders). Another contribution of the current work is the design and implementation of all the algorithms needed to perform the set operations and the neighborhood queries over binary relations represented with BRWT. The last section offers an overall discussion of the results and some conclusions of this work.
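To make the operations listed above concrete, the following is a minimal, uncompressed reference sketch in Python. It only illustrates what each operation means on a relation stored as a plain set of pairs; it is not the $k^2$-tree, BRWT, or compressed adjacency list implementation evaluated in the paper. The class name `NaiveRelation` and all method names are hypothetical, and objects in A and B are assumed to be identified by integers.

```python
from typing import Iterable, List, Set, Tuple

Pair = Tuple[int, int]


class NaiveRelation:
    """Hypothetical uncompressed baseline: a relation R as a set of (a, b) pairs."""

    def __init__(self, pairs: Iterable[Pair]) -> None:
        self.pairs: Set[Pair] = set(pairs)

    # --- primitive neighborhood operations ---

    def is_related(self, a: int, b: int) -> bool:
        """True if aRb holds."""
        return (a, b) in self.pairs

    def successors(self, a: int) -> List[int]:
        """All b such that aRb, in increasing order."""
        return sorted(b for (x, b) in self.pairs if x == a)

    def predecessors(self, b: int) -> List[int]:
        """All a such that aRb, in increasing order."""
        return sorted(a for (a, y) in self.pairs if y == b)

    def range_query(self, a_lo: int, a_hi: int, b_lo: int, b_hi: int) -> List[Pair]:
        """All pairs (a, b) in R with a_lo <= a <= a_hi and b_lo <= b <= b_hi."""
        return sorted(
            (a, b)
            for (a, b) in self.pairs
            if a_lo <= a <= a_hi and b_lo <= b <= b_hi
        )

    # --- set operations between two relations over the same A x B ---

    def union(self, other: "NaiveRelation") -> "NaiveRelation":
        return NaiveRelation(self.pairs | other.pairs)

    def intersection(self, other: "NaiveRelation") -> "NaiveRelation":
        return NaiveRelation(self.pairs & other.pairs)

    def difference(self, other: "NaiveRelation") -> "NaiveRelation":
        return NaiveRelation(self.pairs - other.pairs)

    def symmetric_difference(self, other: "NaiveRelation") -> "NaiveRelation":
        return NaiveRelation(self.pairs ^ other.pairs)
```

A baseline of this kind can serve as an oracle when testing a compact representation: both structures should return the same answers for every query, while differing in space usage and speed.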

PREVIOUS WORK
EMPIRICAL EVALUATION
CONCLUSION