Abstract

With the move toward global, Internet enabled science there is an inherent need to capture, store, aggregate and search scientific data across a large corpus of heterogeneous data silos. As a result, standards development is needed to create an infrastructure capable of representing the diverse nature of scientific data. This paper describes a fundamental data model for scientific data that can be applied to data currently stored in any format, and an associated ontology that affords semantic representation of the structure of scientific data (and its metadata), upon which discipline specific semantics can be applied. Application of this data model to experimental and computational chemistry data are presented, implemented using JavaScript Object Notation for Linked Data. Full examples are available at the project website (Chalk in SciData: a scientific data model. http://stuchalk.github.io/scidata/, 2016).

Highlights

  • For almost 40 years, scientists have been storing scientific data on computers

  • This paper describes a generic scientific data model (SDM)/framework for scientific data derived from (1) the common structure of scientific articles, (2) the needs of electronic notebooks to capture scientific research data and metadata, and (3) the clear need to organize scientific data and its contextual descriptors

  • Considerations for a scientific data model What is scientific data? In order to appreciate what scientific data is we took a step back and looked at the scientific process to abstract the important aspects that underpin the framework of what scientists do and how they do it

Read more

Summary

Introduction

For almost 40 years, scientists have been storing scientific data on computers. With the advancement of Internet technologies and online and local storage capabilities, the options for collecting and stored scientific information have become unlimited. With all these advancements science faces an increasingly important issue of interoperability. Though the Internet has promoted the creation of open standards in many areas, scientific data has, in a sense, been left behind because of its inherent complexity. The problem is the contextualization of the scientific data—the metadata that describes system that it applies to, the way it was investigated, the scientists that determined it, and the quality of the measurements

Objectives
Methods
Results
Conclusion
Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call