Relational Query Processing Research Articles

Emerging applications in Internet of Things (IoT) and Cyber–Physical Systems (CPS) present novel challenges to Big Data platforms for performing online analytics. Ubiquitous sensors from IoT deployments are able to generate data streams at high velocity, that include information from a variety of domains, and accumulate to large volumes on disk. Complex Event Processing (CEP) is recognized as an important real-time computing paradigm for analyzing continuous data streams. However, existing work on CEP is largely limited to relational query processing, exposing two distinctive gaps for query specification and execution: (1) infusing the relational query model with higher level knowledge semantics, and (2) seamless query evaluation across temporal spaces that span past, present and future events. These allow accessible analytics over data streams having properties from different disciplines, and help span the velocity (real-time) and volume (persistent) dimensions. In this article, we introduce a Knowledge-infused CEP (χ-CEP ) framework that provides domain-aware knowledge query constructs along with temporal operators that allow end-to-end queries to span across real-time and persistent streams. We translate this query model to efficient query execution over online and offline data streams, proposing several optimizations to mitigate the overheads introduced by evaluating semantic predicates and in accessing high-volume historic data streams. In particular, we also address temporal consistency issues that arise during fault recovery of query plans that span the boundary between real-time and persistent streams. The proposed χ-CEP query model and execution approaches are implemented in our prototype semantic CEP engine, SCEPter. We validate our query model using domain-aware CEP queries from a real-world Smart Power Grid application, and experimentally analyze the benefits of our optimizations for executing these queries, using event streams from a campus-microgrid IoT deployment. Our results show that we are able to sustain a processing throughput of 3,000 events/secs for χ-CEP queries, a 30× improvement over the baseline and sufficient to support a Smart Township, and can resume consistent processing within 20 secs after stream outages as long as 2 hours.

Read full abstract

Graphics processors (GPUs) have recently emerged as powerful coprocessors for general purpose computation. Compared with commodity CPUs, GPUs have an order of magnitude higher computation power as well as memory bandwidth. Moreover, new-generation GPUs allow writes to random memory locations, provide efficient interprocessor communication through on-chip local memory, and support a general purpose parallel programming model. Nevertheless, many of the GPU features are specialized for graphics processing, including the massively multithreaded architecture, the Single-Instruction-Multiple-Data processing style, and the execution model of a single application at a time. Additionally, GPUs rely on a bus of limited bandwidth to transfer data to and from the CPU, do not allow dynamic memory allocation from GPU kernels, and have little hardware support for write conflicts. Therefore, a careful design and implementation is required to utilize the GPU for coprocessing database queries. In this article, we present our design, implementation, and evaluation of an in-memory relational query coprocessing system, GDB, on the GPU. Taking advantage of the GPU hardware features, we design a set of highly optimized data-parallel primitives such as split and sort, and use these primitives to implement common relational query processing algorithms. Our algorithms utilize the high parallelism as well as the high memory bandwidth of the GPU, and use parallel computation and memory optimizations to effectively reduce memory stalls. Furthermore, we propose coprocessing techniques that take into account both the computation resources and the GPU-CPU data transfer cost so that each operator in a query can utilize suitable processors—the CPU, the GPU, or both—for an optimized overall performance. We have evaluated our GDB system on a machine with an Intel quad-core CPU and an NVIDIA GeForce 8800 GTX GPU. Our workloads include microbenchmark queries on memory-resident data as well as TPC-H queries that involve complex data types and multiple query operators on data sets larger than the GPU memory. Our results show that our GPU-based algorithms are 2--27x faster than their optimized CPU-based counterparts on in-memory data. Moreover, the performance of our coprocessing scheme is similar to, or better than, both the GPU-only and the CPU-only schemes.

Read full abstract

Relational Query Processing Research Articles

Related Topics

Articles published on Relational Query Processing

Technology-Enabled Database Education: Challenges andOpportunities

Technical Perspective: Conjunctive Queries with Comparisons

An Efficient Schema Transformation Technique for Data Migration from Relational to Column-Oriented Databases

Effective and efficient skyline query processing over attribute-order-preserving-free encrypted data in cloud-enabled databases

DIFF: a relational interface for large-scale data explanation

Data-Trace Types for Distributed Stream Processing Systems.

DIFF

Knowledge-infused and consistent Complex Event Processing over real-time and persistent streams

Factorized Databases

OmniDB

Query Processing in Private Data Outsourcing Using Anonymization

Querying probabilistic information extraction

Relational processing of RDF queries

Relational query coprocessing on graphics processors

XML

Correlation maps

Index structures for matching XML twigs using relational query processors

Supporting top-k join queries in relational databases

Algebraic equivalences of nested relational operators

TIMBER: A native XML database

Lead the way for us

Editage

Paperpal

R Discovery

Mind the Graph

Relational Query Processing Research Articles

Related Topics

Articles published on Relational Query Processing

Technology-Enabled Database Education: Challenges andOpportunities

Technical Perspective: Conjunctive Queries with Comparisons

An Efficient Schema Transformation Technique for Data Migration from Relational to Column-Oriented Databases

Effective and efficient skyline query processing over attribute-order-preserving-free encrypted data in cloud-enabled databases

DIFF: a relational interface for large-scale data explanation

Data-Trace Types for Distributed Stream Processing Systems.

DIFF

Knowledge-infused and consistent Complex Event Processing over real-time and persistent streams

Factorized Databases

OmniDB

Query Processing in Private Data Outsourcing Using Anonymization

Querying probabilistic information extraction

Relational processing of RDF queries

Relational query coprocessing on graphics processors

XML

Correlation maps

Index structures for matching XML twigs using relational query processors

Supporting top-k join queries in relational databases

Algebraic equivalences of nested relational operators

TIMBER: A native XML database