Harmonizing heterogeneous multi-proxy data from lake systems

Gregor Pfalz, Bernhard Diekmann, Johann-Christoph Freytag, Boris K Biskaborn

doi:10.1016/j.cageo.2021.104791

Abstract

When performing spatial-temporal investigations of multiple lake systems, geoscientists face the challenge of dealing with complex and heterogeneous data of different types, structures, and formats. To support comparability, it is necessary to transform such data into a uniform format that ensures syntactic and semantic comparability. This paper presents a data science approach for transforming research data from different lake sediment cores into a coherent framework. For this purpose, we collected published and unpublished data from paleolimnological investigations of Arctic lake systems. Our approach adapted methods from the database field, such as developing entity-relationship (ER) diagrams, to understand the conceptual structure of the data independently of the source. We demonstrated the feasibility of our approach by transforming our ER diagram into a database schema for PostgreSQL, a popular database management system (DBMS). We validated our approach by conducting a comparative analysis on a set of acquired data, focusing on the comparison of total organic carbon and bromine content in eight selected sediment cores. Still, we encountered serious obstacles in the development of the ER model. Heterogeneous structures within the collected data made automatic data integration impossible. Additionally, we realized that missing error information hampers the development of a conceptual model. Despite the strong initial heterogeneity of the original data, our harmonization yields comparable datasets, enabling numerical inter-proxy and inter-lake comparison.
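To make the idea of a uniform target format more concrete, the following is a minimal sketch of how records with differing column headers could be mapped into one long-format relational schema and then compared across cores. It is not the authors' schema: the table names, column names, header mapping, and all sample values are hypothetical, and Python's built-in `sqlite3` module stands in for PostgreSQL so that the example runs without a server. The header mapping is maintained by hand, in line with the observation above that fully automatic integration was not possible.

```python
import sqlite3

# Hypothetical long-format schema. Table and column names are illustrative
# assumptions, not the schema published by the authors.
SCHEMA = """
CREATE TABLE lake (
    lake_id INTEGER PRIMARY KEY,
    name    TEXT NOT NULL UNIQUE
);
CREATE TABLE core (
    core_id INTEGER PRIMARY KEY,
    lake_id INTEGER NOT NULL REFERENCES lake(lake_id),
    label   TEXT NOT NULL UNIQUE
);
CREATE TABLE measurement (
    core_id  INTEGER NOT NULL REFERENCES core(core_id),
    depth_cm REAL    NOT NULL,
    proxy    TEXT    NOT NULL,
    value    REAL    NOT NULL,
    unit     TEXT    NOT NULL,
    PRIMARY KEY (core_id, depth_cm, proxy)
);
"""

# Hand-maintained mapping from source column headers to (proxy, unit).
# The headers are invented examples of heterogeneous naming.
COLUMN_MAP = {
    "TOC [%]": ("toc", "percent"),
    "Corg (wt%)": ("toc", "percent"),
    "Br (cps)": ("bromine", "cps"),
    "Br_counts": ("bromine", "cps"),
}


def build_db():
    """Create an in-memory database with the sketch schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def load_core(conn, lake, label, depth_key, rows):
    """Insert one core's wide-format rows as long-format measurements."""
    conn.execute("INSERT OR IGNORE INTO lake (name) VALUES (?)", (lake,))
    lake_id = conn.execute(
        "SELECT lake_id FROM lake WHERE name = ?", (lake,)
    ).fetchone()[0]
    core_id = conn.execute(
        "INSERT INTO core (lake_id, label) VALUES (?, ?)", (lake_id, label)
    ).lastrowid
    for row in rows:
        depth = float(row[depth_key])
        for header, raw in row.items():
            if header == depth_key or raw is None:
                continue  # skip the depth column and missing values
            proxy, unit = COLUMN_MAP[header]  # KeyError flags an unmapped header
            conn.execute(
                "INSERT INTO measurement VALUES (?, ?, ?, ?, ?)",
                (core_id, depth, proxy, float(raw), unit),
            )
    return core_id


def mean_by_core(conn, proxy):
    """Return the mean value of one proxy for each core that reports it."""
    query = """
        SELECT c.label, AVG(m.value)
        FROM measurement AS m
        JOIN core AS c USING (core_id)
        WHERE m.proxy = ?
        GROUP BY c.label
        ORDER BY c.label
    """
    return dict(conn.execute(query, (proxy,)).fetchall())


# Invented sample records from two sources with different conventions.
conn = build_db()
load_core(conn, "Lake A", "A-1", "depth_cm", [
    {"depth_cm": 0.5, "TOC [%]": 4.0, "Br (cps)": 120},
    {"depth_cm": 1.5, "TOC [%]": 6.0, "Br (cps)": 80},
])
load_core(conn, "Lake B", "B-1", "Depth (cm)", [
    {"Depth (cm)": "2", "Corg (wt%)": "3.0", "Br_counts": None},
])
print(mean_by_core(conn, "toc"))
print(mean_by_core(conn, "bromine"))
```

Storing one measurement per row, keyed by core, depth, and proxy, is one common way to let cores with different sets of measured proxies share a single table; whether the published schema takes this form is not stated here.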

Highlights

Ongoing global warming impacts Arctic landscapes through the “Arctic amplification” effect, where temperatures in the Arctic exceed the average Northern Hemisphere surface air temperature change (Biskaborn et al., 2019b; IPCC, 2014; Miller et al., 2010)
We present a conceptual integration approach to enable a comprehensive comparison of datasets of varying quality from laboratory analysis of lake sediment
For the cleansing and integration process presented in this paper, we used a collection of published and unpublished laboratory data and corresponding metadata from lake sediment cores

Summary

Introduction

Ongoing global warming impacts Arctic landscapes through the “Arctic amplification” effect, where temperatures in the Arctic exceed the average Northern Hemisphere surface air temperature change (Biskaborn et al., 2019b; IPCC, 2014; Miller et al., 2010). While scientists continue to collect new data from lake systems each year, thorough handling of already existing datasets might help to fill remaining knowledge gaps about past changes. The quality of these older datasets varies depending on different factors, such as date of creation, individual project goals, available laboratory resources, and personnel bias (Cai and Zhu, 2015; Heidorn, 2008; Wang et al., 2001). By integrating these existing datasets into a coherent framework and reporting standard, we can work with higher reliability and reproducibility, enabling large-scale synthesis studies.

Journal: Computers & Geosciences	Publication Date: Apr 24, 2021
Citations: 6	License type: cc-by
