A Column Styled Composable Schema Matcher for Semantic Data-Types

Xiaofeng Liao, Jordy Bottelier, Zhiming Zhao

doi:10.5334/dsj-2019-025

Abstract

Schema matching is a long-standing challenge in many database-related applications, such as data integration, where two databases with different schemas have to be integrated. With the evolution from databases to big data, schema matching has acquired a wider range of purposes and application contexts: data integration, service integration, semantic data clouding and, more recently, exploratory data analysis over big data. These contexts increase the demand for schema matching between semantic data-types such as XML and RDF. Existing integration approaches have not dealt with the challenges of defining a relation between XML and other semantic data-types. To address these challenges, this paper studies the problem of schema mapping from XML to RDF in two parts. First, it tests the validity of a single matcher applied in a column-based manner to semantic data-types. Second, it tests the validity of a highly configurable framework that uses hierarchical classification to construct a composable pipeline. We propose and implement a Reconfigurable Pipeline for Semi-Automatic Schema Matching (REPSASM), which addresses the customizability of the matching problem by providing an environment in which users can create, configure and experiment with their own schema-matching procedure. The experiments performed in this work show that configurability and hierarchical classification improve the matching result. We also propose an algorithm to automatically optimize such a hierarchical pipeline.

Highlights

Schema matching is a principal problem in many database-related applications, such as data integration, where two databases with different schemas have to be integrated.
The enriched application contexts of schema matching increase the demand for matching between semantic data-types such as XML and RDF.
We propose and implement a Reconfigurable Pipeline for Semi-Automatic Schema Matching (REPSASM), where a pipeline is, in this context, a chain of matchers that is used to classify data.
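
To make the idea of a chain of matchers that classifies columns hierarchically more concrete, here is a minimal Python sketch. It is not the REPSASM implementation: the names `MatcherNode`, `all_match`, `pipeline` and `classify_columns`, as well as the regular expressions and labels, are assumptions chosen purely for illustration. Each node accepts or rejects a whole column of values, and an accepted column is passed down to more specific child matchers.

```python
import re
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional

# All names below are hypothetical; none come from the REPSASM code base.
Column = List[str]
Predicate = Callable[[Column], bool]


@dataclass
class MatcherNode:
    """One matcher in the hierarchy: a label, a column-level test, and children."""
    label: str
    accepts: Predicate
    children: List["MatcherNode"] = field(default_factory=list)

    def classify(self, column: Column) -> Optional[str]:
        # Reject the column outright if this matcher does not accept it.
        if not self.accepts(column):
            return None
        # Otherwise try the more specific child matchers in order.
        for child in self.children:
            result = child.classify(column)
            if result is not None:
                return result
        # No child accepted the column, so this node's label is the answer.
        return self.label


def all_match(pattern: str) -> Predicate:
    """Build a predicate that holds when every value in a non-empty column matches."""
    compiled = re.compile(pattern)
    return lambda column: bool(column) and all(compiled.fullmatch(v) for v in column)


# An assumed two-level hierarchy: the root accepts everything and falls back to "unknown".
pipeline = MatcherNode(
    label="unknown",
    accepts=lambda column: True,
    children=[
        MatcherNode(
            label="number",
            accepts=all_match(r"-?\d+(\.\d+)?"),
            children=[MatcherNode("year", all_match(r"(19|20)\d{2}"))],
        ),
        MatcherNode("email", all_match(r"[^@\s]+@[^@\s]+\.[^@\s]+")),
    ],
)


def classify_columns(columns: Dict[str, Column]) -> Dict[str, str]:
    """Classify each named column with the hierarchical pipeline."""
    return {name: pipeline.classify(values) or "unknown" for name, values in columns.items()}
```

In this sketch, reconfiguring the pipeline amounts to adding, removing or reordering nodes in the tree, which is one simple way to picture the configurability described above; the paper's own framework may organize its matchers differently.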

Summary

Introduction

Related Work

Architecture

Experiments

Dataset

Metrics

Experiment 1

Experiment 2

Findings

Conclusion

Journal: Data Science Journal
Publication Date: Jun 24, 2019
Citations: 6
License type: CC BY 4.0
