A unified metamodel for NoSQL and relational databases

Carlos J Fernández Candel,Diego Sevilla Ruiz,Jesús J García-Molina

doi:10.1016/j.is.2021.101898

Carlos J Fernández Candel, Diego Sevilla Ruiz + Show 1 more

Open Access

https://doi.org/10.1016/j.is.2021.101898

Copy DOI

Journal: Information Systems	Publication Date: Sep 25, 2021
Citations: 29	License type: cc-by-nc-nd

Affiliation: University of Murcia

Abstract

The Database field is undergoing significant changes. Although relational systems are still predominant, the interest in NoSQL systems is continuously increasing. In this scenario, polyglot persistence is envisioned as the database architecture to be prevalent in the future. Therefore, database tools and systems are evolving to support several data models.Multi-model database tools normally use a generic or unified metamodel to represent schemas of the data model that they support. Such metamodels facilitate developing database utilities, as they can be built on a common representation. Also, the number of mappings required to migrate databases from a data model to another is reduced, and integrability is favored.In this paper, we present the U-Schema unified metamodel able to represent logical schemas for the four most popular NoSQL paradigms (columnar, document, key–value, and graph) as well as relational schemas. We will formally define the mappings between U-Schema and the data model defined for each database paradigm. How these mappings have been implemented and validated will be discussed, and some applications of U-Schema will be shown.To achieve flexibility to respond to data changes, most of NoSQL systems are “schema-on-read,” and the declaration of schemas is not required. Such an absence of schema declaration makes structural variability possible, i.e., stored data of the same entity type can have different structure. Moreover, data relationships supported by each data model are different; For example, document stores have aggregate objects but not relationship types, whereas graph stores offer the opposite. Through the paper, we will show how all these issues have been tackled in our approach.As far as we know, no proposal exists in the literature of a unified metamodel for relational and the NoSQL paradigms which describes how each individual data model is integrated and mapped. Our metamodel goes beyond the existing proposals by distinguishing entity types and relationship types, representing aggregation and reference relationships, and including the notion of structural variability. Our contributions also include developing schema extraction strategies for schemaless systems of each NoSQL data model, and tackling performance and scalability in the implementation for each store.

Full Text