A novel approach to assess and improve syntactic interoperability in data integration

Rihem Nasfi, Antoon Bronselaer, Guy De Tré

doi:10.1016/j.ipm.2023.103522

Abstract

Data integration is essential to enrich a database with external information. One effective approach is to match shared identifiers across diverse databases. However, a lack of syntactic interoperability, that is, the ability to match data based on their syntax, can make such matching difficult. In this paper, we present a novel method to evaluate and enhance syntactic interoperability while taking the associated costs into account. First, we introduce the linking index and the completeness index as generic measures of fine-grained syntactic interoperability. Second, we analyze the consistency of the identifiers using a rule-based framework for data quality assessment. Third, we propose a data integration strategy that balances the cost of fixing data inconsistencies against the resulting benefits, as measured by the linking and completeness indices. The approach is illustrated through two use cases: bibliographic databases and clinical trial registries. The results demonstrate that standardizing identifiers' representations can significantly improve syntactic interoperability in certain scenarios, while in others the standardization process yields no improvement and thus argues against integration. By providing a cost–benefit analysis of improving data interoperability, the approach enables data integrators to make informed decisions about the feasibility and advantages of proceeding with data integration.
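To make the idea of standardizing identifier representations concrete, the following is a minimal Python sketch. It is not the paper's method: the abstract does not define the linking index or the completeness index, so the `match_rate` function below is only an assumed, simplified stand-in (the share of identifiers in one source that also occur in the other). The normalization rules in `normalize_doi` and the sample identifiers are likewise hypothetical and chosen purely for illustration.

```python
import re


def normalize_doi(raw: str) -> str:
    """Reduce a DOI string to a bare, lowercase form.

    Hypothetical rule set: trim whitespace, lowercase, and drop a leading
    resolver URL or "doi:" prefix. Real data may need more rules.
    """
    value = raw.strip().lower()
    return re.sub(r"^(https?://(dx\.)?doi\.org/|doi:\s*)", "", value)


def match_rate(source, target) -> float:
    """Fraction of distinct source identifiers found verbatim in target.

    An assumed, simplified measure; not the paper's linking index.
    """
    source_ids = set(source)
    if not source_ids:
        return 0.0
    return len(source_ids & set(target)) / len(source_ids)


# Made-up sample data: the same works written in different syntaxes.
db_a = [
    "10.1016/j.ipm.2023.103522",
    "doi:10.1000/ABC123",
    "https://doi.org/10.1000/xyz789",
    "10.1000/only-in-a",
]
db_b = [
    "https://doi.org/10.1016/J.IPM.2023.103522",
    "10.1000/abc123",
    "DOI: 10.1000/XYZ789",
    "10.1000/only-in-b",
]

# Match rate on the raw strings, then after standardizing both sides.
raw_rate = match_rate(db_a, db_b)
clean_rate = match_rate(
    [normalize_doi(d) for d in db_a],
    [normalize_doi(d) for d in db_b],
)
print(f"before standardization: {raw_rate:.2f}")
print(f"after standardization:  {clean_rate:.2f}")
```

In this toy data no raw strings match, while three of the four identifiers in the first source match after standardization. Comparing the two rates against the effort of writing and applying the normalization rules is the kind of trade-off the abstract describes, though the paper's own measures and cost model may differ.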

Journal: Information Processing & Management
Publication Date: Oct 11, 2023
Citations: 1
