Abstract

In our project we use semantic data management with the open-source research data management system (RDMS) CaosDB [1] to link empirical data and simulation output from Earth System Models [2]. Managing these data structures together allows us to perform complex queries and facilitates the integration of data and metadata into data analysis workflows.

One particular challenge in analysing model output is keeping track of all necessary metadata of each simulation throughout the whole digital workflow. Especially for open science approaches, it is important to document, in human- and computer-readable form, all the information necessary to completely reproduce the obtained results. Furthermore, we want to feed all relevant data from data analysis back into our data management system, so that we can also perform complex queries on data sets and parameters stemming from data analysis workflows.

A specific aim of this project is to re-analyse existing sets of simulations under different research questions. Without proper documentation in an RDMS, this endeavour can become very time-consuming.

We implemented a workflow that combines semantic research data management in CaosDB with Jupyter notebooks and keeps track of the data loaded into an analysis workspace. Procedures are provided that create snapshots of specific states of the analysis. The CaosDB crawler can interpret these snapshots automatically and insert or update records in the system accordingly. The snapshots include links to the input data, parameter information, the source code and the results, and therefore provide a high-level interface to the full chain of data processing, from empirical and simulated raw data to the results. For example, input parameters of complex Earth System Models can be extracted automatically and related to model performance.
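As an illustration of what such a snapshot might contain, the following minimal sketch records input data, parameters, source code and results in a single JSON file. It is not the authors' implementation and does not use the CaosDB API; the function names (`sha256_of`, `build_snapshot`, `write_snapshot`) and the JSON layout are hypothetical and chosen only for this example.

```python
import hashlib
import json
from datetime import datetime, timezone
from pathlib import Path


def sha256_of(path):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_snapshot(input_files, parameters, source_file, result_files):
    """Collect the four kinds of information a snapshot links together.

    The dictionary layout is an assumption made for this sketch.
    """
    def describe(path):
        # A path plus a checksum identifies the exact file version used.
        return {"path": str(path), "sha256": sha256_of(path)}

    return {
        "created": datetime.now(timezone.utc).isoformat(),
        "inputs": [describe(p) for p in input_files],
        "parameters": dict(parameters),
        "source_code": describe(source_file),
        "results": [describe(p) for p in result_files],
    }


def write_snapshot(snapshot, target):
    """Write the snapshot as JSON so that a crawler-like tool could read it."""
    target = Path(target)
    target.write_text(json.dumps(snapshot, indent=2, sort_keys=True))
    return target
```

In a notebook, `build_snapshot` would be called at the point where a state of the analysis should be preserved; storing checksums next to the paths makes it possible to detect later whether an input or result file has changed since the snapshot was taken.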
Our use case supports interactive approaches as well as automated analyses.

[1] Fitschen, T.; Schlemmer, A.; Hornung, D.; tom Wörden, H.; Parlitz, U.; Luther, S. CaosDB—Research Data Management for Complex, Changing, and Automated Research Workflows. Data 2019, 4, 83. https://doi.org/10.3390/data4020083

[2] Schlemmer, A.; Merder, J.; Dittmar, T.; Feudel, U.; Blasius, B.; Luther, S.; Parlitz, U.; Freund, J.; Lennartz, S. T. Implementing semantic data management for bridging empirical and simulative approaches in marine biogeochemistry. EGU General Assembly 2022, Vienna, Austria, 23–27 May 2022, EGU22-11766. https://doi.org/10.5194/egusphere-egu22-11766
