Abstract

Modern information systems require the orchestration of ontologies, conceptual data modeling techniques, and efficient data management in order to support better-informed decision-making and to keep up with new organizational requirements. A major question in delivering such systems is which components to design and put together to make up the required “knowledge to data” pipeline, as each component and process has trade-offs. In this paper, we introduce a new knowledge-to-data architecture, KnowID. It pulls together recently proposed components, and we add novel transformation rules between Enhanced Entity-Relationship (EER) models and the Abstract Relational Model (ARM) to complete the pipeline. KnowID's main distinctive architectural features, compared to other ontology-based data access approaches, are that runtime use can avail of the closed world assumption commonly used in information systems and of full SQL augmented with path queries.
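
To make the EER-to-ARM step concrete, the following is a minimal sketch of the kind of relational encoding such a transformation could produce for a small EER fragment (an entity type Person, a subclass Student, and a Course taught by a Person). The table names, the surrogate attribute self, and the data types are illustrative assumptions for this sketch only, not KnowID's actual transformation rules.

    -- Illustrative ARM-style encoding of a small EER fragment (assumed names).
    -- Every relation carries a surrogate identifier ("self"); the ISA between
    -- Student and Person is expressed by sharing that surrogate.

    CREATE TABLE Person (
      self INT PRIMARY KEY,                          -- abstract object identifier
      name VARCHAR(100) NOT NULL
    );

    CREATE TABLE Student (
      self INT PRIMARY KEY REFERENCES Person(self)   -- Student ISA Person
    );

    CREATE TABLE Course (
      self      INT PRIMARY KEY,
      title     VARCHAR(100) NOT NULL,
      taught_by INT NOT NULL REFERENCES Person(self) -- functional "taught by" reference
    );

A shared surrogate of this kind is what makes attribute navigation across the hierarchy straightforward once path queries are layered on top.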

Highlights

  • Traditional data management comprises a sequence from requirements to conceptual data model, transforming it into a relational model, and from there creating a physical database schema, at each stage leaving behind the artefact produced in the preceding step

  • We aim to extend this partial knowledge-to-data pipeline of the Abstract Relational Model (ARM) + SQL for path queries (SQLP) into the knowledge layer by adding a conceptual data model or application ontology and relevant related model management features, in such a way that runtime use can remain within the closed world assumption and users will still be able to use full Structured Query Language (SQL); an illustrative query sketch follows this list

  • We look into the Knowledge and Information Management box in Figure 3, with its four core subprocesses drawn above it, which depend on the precise input of the “conceptual data model or application ontology C”
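
As a rough illustration of the usability gain from path queries, the first query below uses a hypothetical path-style syntax in the spirit of SQLP (the exact syntax is an assumption made here for illustration), and the second shows an equivalent formulation with plain SQL joins over the illustrative schema sketched after the abstract.

    -- Hypothetical path-style query (illustrative syntax only):
    --   SELECT c.title, c.taught_by.name
    --   FROM   Course c;

    -- An equivalent query in plain SQL over the illustrative schema above:
    SELECT c.title, p.name
    FROM   Course c
    JOIN   Person p ON p.self = c.taught_by;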

Summary

Introduction

Traditional data management comprises a sequence from requirements to conceptual data model, transforming it into a relational model, and from there creating a physical database schema, at each stage leaving behind the artefact produced in the preceding step. From the mid-2000s, theory, techniques, and tools started to advance towards reusing the conceptual model at runtime in conjunction with large data stores to create a “knowledge to data” pipeline. This combination, or pipeline, is referred to by various terms, including ontology-based data access and (virtual) knowledge graphs [3]. Consider, for example, a user, Juan, who needs customer data: if there are no canned queries and Juan cannot write Structured Query Language (SQL) or does not know in which database(s) the customers are stored, he either has to explore the data himself or ask a database administrator for help. It has been reported that data scientists spend at least 80% of their time on data discovery and integration [4].
