Abstract

Data interoperability is an ongoing challenge for global open data initiatives. Machine-readable specification of the data types of datasets can help address interoperability issues. Data types have typically been specified at the syntactic level, such as integer, float, and string in programming languages. This paper presents a model for the semantic specification of data types, such as a topographic map. The work was conducted in the context of the Semantic Web. The model differentiates the semantic data type from the basic data type. The former are instances (e.g., topographic map) of a specific data type class that is defined in the model. The latter are classes (e.g., Image) of resource types in existing ontologies. A data resource is an instance of a basic data type and is tagged with one or more specific data types. The model is implemented within an existing production data portal, where users can register specific data types and use them to annotate data resources. Through the specific data types of a dataset, data users can make explicit the assumptions or information inherent in that dataset. The machine-readable information of specific data types also paves the way for further studies, such as dataset recommendation.
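
The distinction between basic and specific data types can be illustrated with a minimal Python sketch. It is not the paper's ontology or the portal's implementation: the names SpecificDataType, DataResource, Image, and tag are hypothetical, and Python classes merely stand in for ontology classes. The point is only that a basic data type is a class, a specific data type is a registered instance, and a data resource is an instance of the former tagged with one or more of the latter.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SpecificDataType:
    """A semantic data type, registered as an instance (hypothetical name)."""
    label: str
    description: str = ""


class DataResource:
    """Stand-in root for basic data types (classes from existing ontologies)."""

    def __init__(self, title):
        self.title = title
        self.specific_types = set()  # zero or more SpecificDataType tags

    def tag(self, specific_type):
        """Annotate this resource with a specific data type."""
        self.specific_types.add(specific_type)


class Image(DataResource):
    """A basic data type: a class of resources, not an instance."""


# Register a specific data type as an instance, then annotate a resource.
topographic_map = SpecificDataType("topographic map")
resource = Image("Scanned map sheet")  # illustrative title only
resource.tag(topographic_map)
```

Here the resource is an Image by class membership and a topographic map only by annotation, so the same Image class can carry many different specific data types without any new classes being defined.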

Highlights

  • This paper presents a conceptual model for the semantic specification of data types, as well as the implementation of the model in an existing production data portal for a decadal international science program: the Deep Carbon Observatory [10]

  • The source data types, source standards, and creators of a data type are, in part, components of its provenance

  • In order to link those ontologies to the provenance parts in the designed model for data type, we asserted a few existing classes as subclasses of corresponding classes in the PROV-O Ontology
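
The subclass assertions mentioned in the last highlight can be sketched as plain subject-predicate-object triples. This is an illustration under stated assumptions: the example.org namespace and the class names SpecificDataType and Creator are hypothetical, since the excerpt does not say which existing classes were aligned. Only the PROV-O namespace, its Entity and Agent classes, and rdfs:subClassOf are real identifiers.

```python
PROV = "http://www.w3.org/ns/prov#"
RDFS_SUBCLASS_OF = "http://www.w3.org/2000/01/rdf-schema#subClassOf"
EX = "http://example.org/datatype#"  # hypothetical namespace, for illustration

# Hypothetical alignments: which classes the authors aligned is not stated here.
triples = {
    (EX + "SpecificDataType", RDFS_SUBCLASS_OF, PROV + "Entity"),
    (EX + "Creator", RDFS_SUBCLASS_OF, PROV + "Agent"),
}


def superclasses(cls, graph):
    """Return every class reachable from cls through rdfs:subClassOf."""
    found = set()
    frontier = [cls]
    while frontier:
        current = frontier.pop()
        for subject, predicate, obj in graph:
            if (subject == current and predicate == RDFS_SUBCLASS_OF
                    and obj not in found):
                found.add(obj)
                frontier.append(obj)
    return found


entity_parents = superclasses(EX + "SpecificDataType", triples)
```

Once such a triple is asserted, any instance of the aligned class can also be treated as a PROV-O entity or agent, which is what lets generic provenance tooling work with the data type descriptions.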

Summary

Introduction

Efforts on mechanisms for data publication [1], data cataloging [2], data citation [3], and alternative metrics [4] are incubating a new socio-technical system that promotes both the culture and the practice of open data. Within such a system, data will be shared and reused across the boundaries of nations, sectors, disciplines, repositories, and formats, as well as between levels of detail. Among the various metadata elements available, such as those in the Dublin Core Metadata Elements [6] and the DataCite
