Ensuring reliable datasets for environmental models and forecasts

Emery R Boose, Aaron M Ellison, Leon J Osterweil, Lori A Clarke, Rodion Podorozhny, Julian L Hadley, Alexander Wise, David R Foster

doi:10.1016/j.ecoinf.2007.07.006

Abstract

At the dawn of the 21st century, environmental scientists are collecting more data more rapidly than at any time in the past. Nowhere is this change more evident than in the advent of sensor networks able to collect and process (in real time) simultaneous measurements over broad areas and at high sampling rates. At the same time, there has been great progress in the development of standards, methods, and tools for data analysis and synthesis, including a new standard for descriptive metadata for ecological datasets (Ecological Metadata Language, EML) and new workflow tools that help scientists to assemble datasets and to diagram, record, and execute analyses. However, these developments (important as they are) are not yet sufficient to guarantee the reliability of datasets created by a scientific process, that is, the complex activity that scientists carry out in order to create a dataset. We define a dataset to be reliable when the scientific process used to create it is (1) reproducible and (2) analyzable for potential defects. To address this problem, we propose the use of an analytic web, a formal representation of a scientific process that consists of three coordinated graphs (a data-flow graph, a dataset-derivation graph, and a process-derivation graph) originally developed for use in software engineering. An analytic web meets the two key requirements for ensuring dataset reliability: (1) a complete audit trail of all artifacts (e.g., datasets, code, models) used or created in the execution of the scientific process that created the dataset, and (2) detailed process metadata that precisely describe all sub-processes of the scientific process. Construction of such metadata requires the semantic features of a high-level process definition language. In this paper, we illustrate the use of an analytic web to represent the scientific process of constructing estimates of ecosystem water flux from data gathered by a complex, real-time multi-sensor network. We use Little-JIL, a high-level process definition language, to precisely and accurately capture the analytical processes involved. We believe that incorporation of this approach into existing tools and evolving metadata specifications (such as EML) will yield significant benefits to science. These benefits include: complete and accurate representations of scientific processes; support for rigorous evaluation of such processes for logical and statistical errors and for propagation of measurement error; and assurance of dataset reliability for developing sound models and forecasts of environmental change.
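
As a rough illustration of the audit-trail requirement described in the abstract, the sketch below records which artifacts were derived from which inputs and by which process step, then traces the full history of one dataset. It is a minimal sketch only: it is not Little-JIL and not the authors' implementation, and the class name `DerivationGraph`, its methods, and all artifact and step names (`range_check`, `gap_fill`, and so on) are hypothetical, chosen to echo the water-flux example.

```python
from dataclasses import dataclass, field


@dataclass
class DerivationGraph:
    """Hypothetical dataset-derivation graph: artifact -> (process step, inputs)."""
    derived_from: dict = field(default_factory=dict)

    def record(self, output, step, inputs):
        # Log one execution of a process step that produced `output` from `inputs`.
        self.derived_from[output] = (step, tuple(inputs))

    def audit_trail(self, artifact):
        # Collect every (output, step, inputs) record that contributed to
        # `artifact`, ordered so each record follows the records it depends on.
        trail, seen = [], set()

        def visit(name):
            if name in seen or name not in self.derived_from:
                return  # raw inputs (e.g. sensor readings) have no record
            seen.add(name)
            step, inputs = self.derived_from[name]
            for parent in inputs:
                visit(parent)
            trail.append((name, step, inputs))

        visit(artifact)
        return trail


# Illustrative (assumed) processing chain for a water-flux estimate.
graph = DerivationGraph()
graph.record("cleaned_flux", "range_check", ["raw_sensor_readings"])
graph.record("gap_filled_flux", "gap_fill", ["cleaned_flux", "gap_fill_model"])
graph.record("daily_water_flux", "aggregate_daily", ["gap_filled_flux"])

trail = graph.audit_trail("daily_water_flux")
for output, step, inputs in trail:
    print(f"{output} <- {step}({', '.join(inputs)})")
```

A structure like this covers only the first requirement (the audit trail of artifacts). The second requirement, precise process metadata for every sub-process, is what the abstract says calls for a process definition language.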

Journal: Ecological Informatics
Publication Date: Oct 1, 2007
Citations: 59

Similar Papers

ANALYTIC WEBS SUPPORT THE SYNTHESIS OF ECOLOGICAL DATA SETS
Aaron M Ellison ... Howard Schultz
Ecology | VOL. 87
01 Jun 2006

Mission: educational.
John Manuel
Environmental health perspectives | VOL. 112
01 Oct 2004

Architectural evaluation of beam-steered shuffle optical interconnect
Miles J Murdocca ... Michael Dennison
12 Jun 1996

A Unique Marine and Environmental Science Program for High School Teachers in Hawai‘i: Professional Development, Teacher Confidence, and Lessons Learned
Malia Ana J Rivera ... David A Krupp
International Journal of Environmental and Science Education | VOL. 8
01 Apr 2013
