Abstract

One of the critical problems in the curation of research data is the harmonization of its internal metadata schemata. The value of harmonizing such data is well illustrated by the Berkeley Earth project, which successfully integrated into one metadata schema raw climate datasets from a wide variety of geographical sources and time periods (spanning 250 years). Doing so enabled climate scientists to calculate a more accurate estimate of the recent changes in Earth’s average land surface temperatures and to ascertain the extent to which climate change is anthropogenic. This paper surveys some of the approaches that have been taken to the integration of data schemata in general and examines some of the specific metadata features of the source surface temperature datasets that were harmonized by Berkeley Earth. The conclusion drawn from this analysis is that the original source data and the Berkeley Earth common format provide a promising training set on which to apply machine learning methods for replicating the human data integration process. This paper describes research in progress on a domain-independent approach to the metadata harmonization problem that could be applied to other fields of study and be incorporated into a data portal to enhance the discoverability and reuse of data from a broad range of data sources.

Highlights

  • One of the critical features of a research data set is the metadata schema, sometimes referred to as the data format, that specifies the semantics for its data points

  • Some data obtained by researchers in one discipline, such as ecology, may be relevant to another discipline, such as climatology

  • This paper argues for the value and the feasibility of a machine-learning approach for addressing the data harmonization problem

Introduction

One of the critical features of a research data set is the metadata schema, sometimes referred to as the data format, that specifies the semantics for its data points. Given a suitably constructed ontology in a specific domain (e.g., the CIDOC Conceptual Reference Model, which provides a common semantic framework for cultural heritage information), it is possible to develop rule-based algorithms to generate candidate crosswalks between schemata (Gaitanou et al., 2012). This approach is only effective if the mediating translation schema is an adequate abstraction of the subject domain. Database schema matching systems and ontology integration systems typically rely on known source and target schemas, applying linguistic and rule-based approaches to perform the mapping. Neither of these strategies generalizes well in legacy data documentation environments whose interpretation is highly dependent on the software designed to read it, as is typically the case with climate datasets. The fact that multiple crosswalks have already been written for the same target metadata schema affords the opportunity to automate the mapping with machine learning methods.
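To make the contrast concrete, the sketch below (in Python) illustrates the linguistic matching baseline described above: proposing candidate crosswalk entries by scoring string similarity between source and target field names. The field names and threshold are hypothetical illustrations, not drawn from the Berkeley Earth datasets.

# A minimal sketch of a linguistic schema-matching baseline: propose
# candidate crosswalk entries by scoring string similarity between
# source and target field names. All names here are hypothetical.
from difflib import SequenceMatcher

def candidate_crosswalk(source_fields, target_fields, threshold=0.5):
    """Return (source, target, score) triples whose field-name
    similarity meets the threshold, strongest candidates first."""
    candidates = []
    for s in source_fields:
        for t in target_fields:
            score = SequenceMatcher(None, s.lower(), t.lower()).ratio()
            if score >= threshold:
                candidates.append((s, t, score))
    return sorted(candidates, key=lambda c: -c[2])

# Hypothetical station-record schemata:
source = ["stn_id", "lat", "lon", "tavg_c", "obs_date"]
target = ["station_id", "latitude", "longitude", "mean_temperature", "date"]
for s, t, score in candidate_crosswalk(source, target):
    print(f"{s} -> {t} ({score:.2f})")

On these hypothetical names the baseline recovers stn_id -> station_id but also proposes the spurious lat -> date and misses tavg_c -> mean_temperature entirely; closing both kinds of gap is precisely what supervision from existing human-written crosswalks could provide to a learned matcher.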

A Machine Learning Approach to Mapping Schemata
Conclusions and Future Work