Efficient Aggregation Query Processing for Large-Scale Multidimensional Data by Combining RDB and KVS

Yuya Watari, Atsushi Keyaki, Jun Miyazaki, Masahide Nakamura

doi:10.1007/978-3-319-98809-2_9

Abstract

This paper presents a highly efficient aggregation query processing method for large-scale multidimensional data. Recent developments in network technologies have led to the generation of large amounts of multidimensional data, such as sensor data, and aggregation queries play an important role in analyzing such data. Although relational databases (RDBs) support efficient aggregation queries through indexes, they can become a bottleneck as the data size increases. A distributed key-value store (D-KVS), on the other hand, scales out well for data insertion throughput, but its limited index support means that querying multidimensional data sometimes requires a full data scan. The proposed method combines an RDB and a D-KVS so that their advantages complement each other. In addition, a novel technique is presented in which the data are divided into subsets called grids, and the aggregated values for each grid are precomputed. This technique improves query processing performance by reducing the amount of scanned data. We evaluated the proposed method by comparing it with current state-of-the-art methods and showed that it outperforms them in both query and insertion performance.
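The grid idea summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: in-memory dictionaries stand in for both the RDB and the D-KVS, the data are two-dimensional, and the class name `GridAggregator`, the parameter `cell_size`, and the method `range_sum` are all hypothetical names chosen for illustration. The sketch keeps a precomputed SUM per grid cell, answers a range query from those sums for every cell the query fully covers, and scans raw records only in cells the query partially overlaps.

```python
from collections import defaultdict


class GridAggregator:
    """Illustrative only: plain dicts stand in for the RDB and the D-KVS."""

    def __init__(self, cell_size):
        self.cell_size = cell_size
        self.cell_points = defaultdict(list)  # raw records per grid cell
        self.cell_sum = defaultdict(float)    # precomputed SUM per grid cell

    def _cell(self, x, y):
        # Map a point to the index of the grid cell that contains it.
        return (int(x // self.cell_size), int(y // self.cell_size))

    def insert(self, x, y, value):
        key = self._cell(x, y)
        self.cell_points[key].append((x, y, value))
        self.cell_sum[key] += value  # keep the aggregate up to date on insert

    def range_sum(self, x_lo, x_hi, y_lo, y_hi):
        """Return (SUM of values with x_lo <= x < x_hi and y_lo <= y < y_hi,
        number of raw records scanned)."""
        s = self.cell_size
        total, scanned = 0.0, 0
        cx_lo, cy_lo = self._cell(x_lo, y_lo)
        cx_hi, cy_hi = self._cell(x_hi, y_hi)
        for cx in range(cx_lo, cx_hi + 1):
            for cy in range(cy_lo, cy_hi + 1):
                key = (cx, cy)
                if key not in self.cell_points:
                    continue  # empty cell
                bx_lo, by_lo = cx * s, cy * s
                bx_hi, by_hi = bx_lo + s, by_lo + s
                if bx_lo >= x_hi or by_lo >= y_hi:
                    continue  # cell starts at or beyond the open upper bound
                if x_lo <= bx_lo and bx_hi <= x_hi and y_lo <= by_lo and by_hi <= y_hi:
                    # Cell fully covered: use the precomputed aggregate.
                    total += self.cell_sum[key]
                else:
                    # Cell partially covered: scan only this cell's records.
                    for x, y, v in self.cell_points[key]:
                        scanned += 1
                        if x_lo <= x < x_hi and y_lo <= y < y_hi:
                            total += v
        return total, scanned


agg = GridAggregator(cell_size=5)
for x in range(20):
    for y in range(20):
        agg.insert(x, y, 1.0)

total, scanned = agg.range_sum(3, 12, 0, 10)
print(total, scanned)
```

In this toy setup, only the records in boundary cells are scanned, while interior cells contribute their precomputed sums directly. Which store holds the raw records and which holds the per-grid aggregates is a design decision of the proposed method that this sketch does not attempt to reproduce.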

Full Text