Abstract

Research on the automatic integration of structured and semi-structured data has not succeeded over the past fifty years. No theory of data integration exists. The theoretically necessary requirements for fully supporting automatic data integration from autonomous heterogeneous data sources are unknown. Therefore, it is not possible to objectively evaluate whether, and by how much, new algorithms, techniques, and specifically Data Definition Languages move towards meeting such theoretical requirements. To overcome the serious reverse salient that the field and industry are in, it would help to develop a data integration theory. This article proposes a new look at data integration by using complex adaptive systems principles to analyze current shortcomings and propose a direction that may lead to a data integration theory.

Highlights

  • This article proposes a new look at data integration by using complex adaptive systems principles to analyze current shortcomings and propose a direction that may lead to a data integration theory

  • Examples of Data Definition Languages (DDLs) include COBOL’s structured File Description (FD) section; delimited flat files such as Comma Separated Values (CSV) and Data Interchange Format (DIF) for data exchange; Structured Query Language (SQL) for relational databases; Extensible Markup Language (XML) for semi-structured data and metadata; and ontologies expressed in a variety of DDLs such as Resource Description Framework (RDF) and Web Ontology Language (OWL)

  • It too references data integration techniques that are two decades old, none of which produced a data integration solution that works without substantial human intervention. It is worth taking the risk of looking at DDL engineering geared towards data integration from an entirely different perspective

Summary

Motivation and Introduction

Data integration is a pervasive challenge in applications that need to query across multiple autonomous and heterogeneous data sources. It is a major challenge for companies undergoing mergers or acquisitions. It is not possible to objectively evaluate whether, and by how much, new algorithms, techniques, and Data Definition Languages (DDLs) move towards meeting the requirements of automatic data integration. Nor is it possible to suggest a better algorithm, technique, or DDL that might advance the state of the art of automatic data integration, because the requirements do not exist.

Data Structures
Data Integration Approaches
Variety
Tension
Entropy
Law of Requisite Variety
Data Integration and CAS Principles
Desired Attributes
Summary