Enabling Data Reuse Through Semantic Enrichment of Instrumentation

Robert Huber, Marianne Rehage, Tina Dohna, Egor Gordeev, Anusuriya Devaraju, Roland Koppe, Uwe Schindler, Michael Diepenbroek

doi:10.5194/egusphere-egu2020-7058

Abstract

Pressing environmental and societal challenges demand the reuse of data on a much larger scale. Central to improvements on this front are approaches that support structured and detailed descriptions of published data. In general, the reusability of scientific datasets, such as measurements generated by instruments, observations collected in the field, and model simulation outputs, requires information about the contexts in which they were produced. These contexts include the instrumentation, methods, and analysis software used. In current data curation practice, data providers often put significant effort into capturing descriptive metadata about datasets. Nonetheless, metadata about instruments and methods provided by data authors are limited and, in most cases, unstructured.

The ‘Interoperability’ principle of FAIR emphasizes the importance of using formal vocabularies to enable machine-understandability of data and metadata, and of establishing links between data and related research entities to provide their contextual information (e.g., devices and methods). To support FAIR data, PANGAEA is currently developing workflows to enrich the instrument information of scientific datasets using internal as well as third-party services, ontologies, and their identifiers. This abstract presents our ongoing development within the projects FREYA and FAIRsFAIR as follows:

- Integrating the AWI O2A (Observations to Archives) framework and its associated suite of tools into PANGAEA’s curatorial workflow, as well as semi-automated ingestion of observatory data.
- Linking data with their observation sources (devices) by recording the persistent identifiers (PIDs) from the O2A sensor registry system (sensor.awi.de) as part of the PANGAEA instrumentation database.
- Enriching device and method descriptions of scientific data by annotating them with appropriate vocabularies, such as the NERC device type and device vocabularies or scientific methodology classifications.

In our contribution we will also outline the challenges to be addressed in enabling FAIR vocabularies of instruments and methods. These include questions regarding the reliability and trustworthiness of third-party ontologies and services. Further challenges concern content synchronisation across linked resources and the implications for the FAIRness levels of datasets, such as dependencies on interlinked data sources and vocabularies.

We will show to what extent adapting, harmonizing, and controlling the vocabularies used, as well as the identifier systems shared between data provider and data publisher, improves the findability and reusability of datasets while keeping the curatorial overhead as low as possible. This use case is a valuable example of how improving interoperability through harmonization efforts, though initially problematic and labor-intensive, can benefit a multitude of stakeholders in the long run: data users, publishers, research institutes, and funders.

Similar Papers

Drivers and barriers towards re-using open government data (OGD): a case study of open data initiative in Oman
Stuti Saxena
foresight | VOL. 20
09 Apr 2018

Data compilations for enriched reuse of sea ice data sets
Anna Simson ... Julia Kowalski
15 May 2023

Report on the First Workshop on Linking and Contextualizing Publications and Datasets
Paolo Manghi ... Jochen Schirrwagen
ACM SIGMOD Record | VOL. 43
04 Dec 2014

Mandatory submission of microarray data to public repositories: how is it working?
Beverly Ventura
Physiological Genomics | VOL. 20
20 Jan 2005
