Abstract

Recent technological advances and the emergence of open data in environmental and life sciences make available an unprecedented amount of data that can be exploited for studying complex phenomena. These new research opportunities come with new data management challenges, which must be addressed to ensure effective data sharing, discovery and reuse. The challenges are magnified in interdisciplinary research contexts, because each discipline has its own practices, e.g., specific formats and metadata standards. Moreover, most current data management practices do not consider the semantic heterogeneity that exists among disciplines. For this reason, we introduce a flexible metadata model that describes the datasets of various disciplines using a common paradigm based on the observation concept. This paradigm provides a key vision for articulating the user point of view with the underlying scientific domains. In this study, we therefore mainly reuse the lightweight SOSA ontology (Sensor, Observation, Sample, and Actuator), which lets us efficiently leverage other existing ontologies to improve the discovery and reuse of datasets coming from Earth and life observation. The main benefit of the proposed metadata model is that it extends the technical description usually provided by existing metadata models with a description of the observation context, which addresses the need for a user viewpoint. Moreover, following the FAIR principles, the metadata model specifies the semantics of its elements using ontologies and vocabularies, and reuses existing ontological and terminological resources as much as possible. We show the benefit and applicability of the model through a case study that we identified as representative after interviewing researchers in environmental and life sciences.

Highlights

  • To tackle broader and more complex questions about the natural world, scientists in environmental and life sciences can now exploit the vast amount of data available through different platforms and services (Kelling et al., 2009), thanks to both advances in technology and the advent of open science

  • In this case, when the scientist publishes their dataset on the web, they will describe “chlorophyll concentration” and “sea surface temperature” as features that have an impact on the object of the study

  • We suggest uniformly representing datasets originating from different disciplines using a common description based on the observation paradigm; more precisely, we suggest using the SOSA observation model as a metadata model

Summary

Introduction

To tackle broader and more complex questions about the natural world, scientists in environmental and life sciences can now exploit the vast amount of data available through different platforms and services (Kelling et al., 2009), thanks to both advances in technology and the advent of open science. We provide a metadata model that embodies a user-centric viewpoint and satisfies the description needs of different communities. It characterizes a dataset by multiple aspects associated with an observation (i.e., object of interest, observed property, collection protocol, spatial and temporal extents). These elements are highly abstract and are shared and understood by most of the environmental community. They can be used, together or separately, to discover datasets and evaluate their relevance, depending on the focus of the disciplines involved in a study (different disciplines usually prioritize different aspects when discovering and evaluating datasets).
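As a rough illustration of how such observation aspects could drive dataset discovery, the following Python sketch describes a dataset by its object of interest, observed properties, protocol, and spatial and temporal extents, then filters on whichever aspects a user supplies. The class, field and function names are hypothetical, the example dataset is invented, and the code does not use the SOSA vocabulary itself; it only mirrors the aspects listed above.

```python
from dataclasses import dataclass
from datetime import date
from typing import FrozenSet, Optional, Tuple

BBox = Tuple[float, float, float, float]  # (min_lon, min_lat, max_lon, max_lat)
Period = Tuple[date, date]                # (start, end), both inclusive


@dataclass(frozen=True)
class DatasetDescription:
    """Hypothetical, simplified description; not the schema of the paper."""
    title: str
    feature_of_interest: str
    observed_properties: FrozenSet[str]
    protocol: str
    bbox: BBox
    period: Period


def boxes_overlap(a: BBox, b: BBox) -> bool:
    # Two boxes overlap when they intersect on both the lon and lat axes.
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def periods_overlap(a: Period, b: Period) -> bool:
    return a[0] <= b[1] and b[0] <= a[1]


def matches(ds: DatasetDescription,
            feature: Optional[str] = None,
            observed_property: Optional[str] = None,
            bbox: Optional[BBox] = None,
            period: Optional[Period] = None) -> bool:
    """Return True if the dataset satisfies every criterion that was given.

    Criteria left as None are ignored, so each discipline can search on
    the aspects it cares about.
    """
    if feature is not None and ds.feature_of_interest != feature:
        return False
    if observed_property is not None and observed_property not in ds.observed_properties:
        return False
    if bbox is not None and not boxes_overlap(ds.bbox, bbox):
        return False
    if period is not None and not periods_overlap(ds.period, period):
        return False
    return True


# Invented example record, for illustration only.
example = DatasetDescription(
    title="Illustrative ocean survey",
    feature_of_interest="sea water",
    observed_properties=frozenset({"chlorophyll concentration",
                                   "sea surface temperature"}),
    protocol="satellite remote sensing",
    bbox=(-10.0, 40.0, 5.0, 50.0),
    period=(date(2015, 1, 1), date(2016, 12, 31)),
)
```

A search on a single aspect, such as `matches(example, observed_property="chlorophyll concentration")`, and a search that combines a spatial box with a time period both go through the same function, which is the point of describing every dataset with the same small set of observation aspects.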

Related works
A user-centric metadata model for facilitating interdisciplinary research
Proposed metadata model
Using the FOI and UFOI concepts to enable domain-neutral dataset searches
Adding spatial and temporal dataset granularity for discovery process
A practical example
Discussion
Conclusion
