Abstract
Recent technological advances and the emergence of open data in environmental and life sciences are opening new research opportunities while creating new challenges around data management. They make available an unprecedented amount of data that can be exploited for studying complex phenomena. However, these data management challenges must be addressed to ensure effective data sharing, discovery and reuse. The issues are magnified in interdisciplinary research contexts, because each discipline has its own practices, e.g., specific formats and metadata standards. Moreover, most current data management practices do not consider the semantic heterogeneity that exists among disciplines. For this reason, we introduce a flexible metadata model that describes the datasets of various disciplines using a common paradigm based on the observation concept. It provides a key vision for articulating the user point of view with the underlying scientific domains. In this study, we therefore mainly reuse the lightweight SOSA ontology (Sensor, Observation, Sample, and Actuator), which lets us efficiently leverage other existing ontologies, to improve the discovery and reuse of datasets coming from Earth and life observation. The main benefit of the proposed metadata model is that it extends the technical description usually provided by existing metadata models with a description of the observation context, which addresses the needs of the user viewpoint. Moreover, following the FAIR principles, the metadata model specifies the semantics of its elements using ontologies and vocabularies, and reuses existing ontological and terminological resources as much as possible. We show the benefit and applicability of the model through a case study that we identified as representative after interviewing researchers in environmental and life sciences.
Highlights
To tackle broader and more complex questions about the natural world, scientists in environmental and life sciences can now exploit the vast amount of data available through different platforms and services (Kelling et al., 2009), thanks to both advances in technology and the advent of open science
In this case, when the scientist publishes their dataset on the web, they will describe “chlorophyll concentration” and “sea surface temperature” as features that have an impact on their object of study
We suggest uniformly representing datasets originating from different disciplines using a common description based on the observation paradigm; more precisely, we suggest using the SOSA observation model as a metadata model
Summary
To tackle broader and more complex questions about the natural world, scientists in environmental and life sciences can now exploit the vast amount of data available through different platforms and services (Kelling et al., 2009), thanks to both advances in technology and the advent of open science. We provide a metadata model that, embodying a user-centric viewpoint, satisfies the description needs of different communities. It characterizes a dataset through multiple aspects associated with an observation (i.e., object of interest, observed property, collection protocol, spatial and temporal extents). These elements, which have a high level of abstraction and are shared and understood by most of the environmental community, are used, together or separately, for discovering and evaluating the relevance of a dataset depending on the focus of the disciplines involved in a study (different disciplines usually privilege different aspects when discovering and evaluating datasets).
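As an illustration only, the observation-based description outlined above could be sketched in Python as follows. The class `DatasetDescription`, the function `find_by_property`, the use of Dublin Core Terms for the title and extents, and all sample values except the two observed properties are assumptions made for this sketch; the mapping of each aspect to a SOSA property is likewise a guess and not necessarily the mapping used in the proposed model.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

# Namespace IRIs. SOSA is the W3C ontology named in the text; Dublin Core
# Terms is an assumption, used here only for the title and the extents.
SOSA = "http://www.w3.org/ns/sosa/"
DCT = "http://purl.org/dc/terms/"


@dataclass
class DatasetDescription:
    """Observation-centred description of one dataset (hypothetical class)."""
    title: str
    feature_of_interest: str          # object of interest
    observed_properties: List[str]    # what was observed
    procedure: str                    # collection protocol
    spatial_extent: str
    temporal_extent: Tuple[str, str]  # (start, end) as ISO 8601 dates

    def to_jsonld(self) -> Dict[str, object]:
        """Serialise to a JSON-LD-like dict; the property mapping is an assumption."""
        start, end = self.temporal_extent
        return {
            "@context": {"sosa": SOSA, "dct": DCT},
            "dct:title": self.title,
            "sosa:hasFeatureOfInterest": self.feature_of_interest,
            "sosa:observedProperty": list(self.observed_properties),
            "sosa:usedProcedure": self.procedure,
            "dct:spatial": self.spatial_extent,
            "dct:temporal": f"{start}/{end}",
        }


def find_by_property(catalog: List[DatasetDescription],
                     observed_property: str) -> List[DatasetDescription]:
    """Return the descriptions that list the given observed property."""
    wanted = observed_property.casefold()
    return [d for d in catalog
            if wanted in (p.casefold() for p in d.observed_properties)]


# Placeholder values: only the two ocean observed properties come from the text.
catalog = [
    DatasetDescription(
        title="Example ocean dataset",
        feature_of_interest="example marine area",
        observed_properties=["chlorophyll concentration",
                             "sea surface temperature"],
        procedure="example collection protocol",
        spatial_extent="example bounding box",
        temporal_extent=("2000-01-01", "2000-12-31"),
    ),
    DatasetDescription(
        title="Example soil dataset",
        feature_of_interest="example field plot",
        observed_properties=["soil moisture"],
        procedure="example collection protocol",
        spatial_extent="example bounding box",
        temporal_extent=("2001-01-01", "2001-12-31"),
    ),
]

hits = find_by_property(catalog, "Sea Surface Temperature")
print([d.title for d in hits])
```

A discovery function keyed on another aspect (for instance the object of interest or the collection protocol) would follow the same pattern, which reflects the idea that different disciplines privilege different aspects when searching for datasets.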