Distributed Secondo: an extensible and scalable database management system

Jan Kristof Nidzwetzki, Ralf Hartmut Güting

doi:10.1007/s10619-017-7198-9

Abstract

This paper describes a novel method for coupling a standalone database management system (DBMS) with a highly scalable key-value store. The system uses Apache Cassandra for data storage and the extensible DBMS Secondo as its query processing engine. The result is a distributed, general-purpose DBMS that is highly scalable and fault tolerant. The logical ring of Cassandra is used to split input data into smaller units of work (UOWs), which can be processed independently. A decentralized algorithm assigns the UOWs to query processing nodes. If a node fails, its UOWs are recalculated on a different node. All data models (e.g., relational, spatial, and spatio-temporal) and functions (e.g., filters, aggregates, joins, and spatial joins) implemented in Secondo can be used in a scalable way without changing their implementation. Many aspects of the distribution are hidden from the user, and existing sequential queries can easily be converted into parallel ones.
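
To make the idea of units of work concrete, the following is a minimal sketch and not the authors' implementation. It assumes a simplified integer token space in place of Cassandra's logical ring, and it uses a plain round-robin loop with retry as a stand-in for the decentralized assignment algorithm, which the abstract does not detail. The names `UnitOfWork`, `split_ring`, and `process_all` are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Set, Tuple


@dataclass(frozen=True)
class UnitOfWork:
    """A half-open token range [start, end) of a simplified logical ring."""
    start: int
    end: int


def split_ring(ring_size: int, parts: int) -> List[UnitOfWork]:
    """Split the token space [0, ring_size) into `parts` contiguous ranges."""
    if parts <= 0 or ring_size < parts:
        raise ValueError("need 0 < parts <= ring_size")
    bounds = [i * ring_size // parts for i in range(parts + 1)]
    return [UnitOfWork(bounds[i], bounds[i + 1]) for i in range(parts)]


def process_all(
    uows: List[UnitOfWork],
    nodes: List[str],
    failed: Set[str],
    process: Callable[[UnitOfWork], int],
) -> Dict[UnitOfWork, Tuple[str, int]]:
    """Process every UOW, recomputing on another node if the first choice failed.

    Assumption: round-robin with retry replaces the paper's decentralized
    algorithm purely for illustration.
    """
    results: Dict[UnitOfWork, Tuple[str, int]] = {}
    for i, uow in enumerate(uows):
        for attempt in range(len(nodes)):
            node = nodes[(i + attempt) % len(nodes)]
            if node in failed:
                continue  # node is down: try the next one for this UOW
            results[uow] = (node, process(uow))
            break
        else:
            raise RuntimeError("no live node available")
    return results
```

Because each UOW covers a disjoint token range, the ranges can be processed in any order and on any node, which is what makes recomputation after a node failure straightforward in this simplified model.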

Full Text