Synthetic data for reef modelling

Rose Crocker, Barbara J Robson, Chinenye Ani, Ken Anthony, Takuya Iwanaga

doi:10.1016/j.ecoinf.2024.102698

Open Access

https://doi.org/10.1016/j.ecoinf.2024.102698

Journal: Ecological Informatics
Publication Date: Jun 20, 2024
License type: cc-by-nc-nd

Abstract

Synthetic data mimics the statistical properties of real-world datasets while removing reference to sensitive or confidential information in the original dataset (Quintana, 2020). Synthetic data is also useful for general model testing and development, with many methods available for generating data from machine learning models (Raghunathan, 2021). Although not widely used in ecological and environmental modelling, synthetic data can support and accelerate model testing and analyses where rightsholders are sensitive to data disclosure for study areas, or where data collection is expensive.

In the context of reef modelling, synthetic data can be used to support model analyses that can be published without referring to specific sites, reefs, or study areas. This is desirable in the context of decision support for restoration of the Great Barrier Reef. The Reef has many stakeholders, and releasing early modelling results for intervention scenarios in specific areas would be premature until management or intervention strategy options have been discussed with stakeholders and/or rightsholders. Synthetic data offers a path to publishing model and method demonstrations, sharing knowledge with the reef decision-support community without prematurely suggesting policy recommendations for reefs that are sensitive for rightsholders or stakeholders.

We showcase a synthetic data pipeline developed for the reef decision-support system ADRIA (Adaptive Dynamic Reef Intervention Algorithms), using methods from the Python package Synthetic Data Vault (Patki et al., 2016) and others. The synthetic data models are developed to emulate the statistics of case-study reefs for publishing decision-support tool demonstrations, testing and method validation without revealing sensitive reef site information. The pipeline includes models for tabular data (benthic/compositional reef data), spatial-temporal data (wave and heat stress data) and spatial network data (coral larval connectivity). Conditional sampling methods that connect spatial relationships across datasets are used to develop synthetic reef data packages which mimic the statistical properties of the original dataset. The utility of the synthetic data is demonstrated on a sample reef data package, and the methods used for anonymizing the data are detailed. The results are discussed in the context of formalizing synthetic data for reef modelling. All synthetic data code is available in the open-AIMS/ADRIA-synthetic-data repository on GitHub (v0.1.0), DOI: https://doi.org/10.5281/zenodo.10158323.
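The idea of fitting a statistical model to a table and then sampling from it conditionally can be illustrated with a small standard-library Python sketch. This is not the ADRIA pipeline and does not use the Synthetic Data Vault API; it swaps in a plain bivariate Gaussian as a stand-in model. The column names (`depth`, `cover`), the toy "original" data and all function names are hypothetical and chosen only for illustration.

```python
import math
import random
import statistics


def fit_bivariate_gaussian(xs, ys):
    """Estimate means, standard deviations and correlation of two columns."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sx, sy = statistics.stdev(xs), statistics.stdev(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)
    return {"mx": mx, "my": my, "sx": sx, "sy": sy, "rho": cov / (sx * sy)}


def sample_conditional(model, x, rng):
    """Draw y given a fixed x from the Gaussian conditional distribution."""
    slope = model["rho"] * model["sy"] / model["sx"]
    mean = model["my"] + slope * (x - model["mx"])
    sd = model["sy"] * math.sqrt(max(0.0, 1.0 - model["rho"] ** 2))
    return rng.gauss(mean, sd)


def sample_joint(model, n, rng):
    """Draw n synthetic (x, y) rows: x from its marginal, then y given x."""
    rows = []
    for _ in range(n):
        x = rng.gauss(model["mx"], model["sx"])
        rows.append((x, sample_conditional(model, x, rng)))
    return rows


# Toy "original" table (made-up values, not real reef data):
# site depth in metres and coral cover as a fraction.
rng = random.Random(42)
depth = [rng.uniform(2.0, 12.0) for _ in range(500)]
cover = [0.6 - 0.03 * d + rng.gauss(0.0, 0.05) for d in depth]

model = fit_bivariate_gaussian(depth, cover)
synthetic = sample_joint(model, 500, rng)

# Conditional sampling: a synthetic cover value for a site at a chosen depth.
cover_at_5m = sample_conditional(model, 5.0, rng)
```

A Gaussian model like this only reproduces means, variances and the linear correlation between columns, and it does not constrain values to a valid range (for example, cover between 0 and 1). Copula-based or machine-learning synthesizers, such as those the abstract refers to, are designed to capture richer structure.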

Similar Papers

Enhancing biomechanical machine learning with limited data: generating realistic synthetic posture data using generative artificial intelligence.
Carlo Dindorf ... Frederike Werthmann
Frontiers in bioengineering and biotechnology | VOL. 12
14 Feb 2024

#5490 GENERATIVE ARTIFICIAL INTELLIGENCE FOR CREATION OF SYNTHETIC HYPERTENSION TRIAL DATA
Chirag Jain ... Conor Judge
Nephrology Dialysis Transplantation | VOL. 38
14 Jun 2023

Computational Study Protocol: Leveraging Synthetic Data to Validate a Benchmark Study for Differential Abundance Tests for 16S Microbiome Sequencing Data.
Eva Kohnert ... Clemens Kreutz
F1000Research | VOL. 13
01 Jan 2024

GLSTM: A novel approach for prediction of real & synthetic PID diabetes data using GANs and LSTM classification model
Priyanka Gupta ... Sushma Jaiswal
International Journal of Experimental Research and Review | VOL. 30
30 Apr 2023
