Abstract

Large differences in instrumentation, site setup, data format, and operating system stymie the adoption of a universal computational environment for processing and analyzing eddy-covariance (EC) data. This results in limited software applicability and extensibility, in addition to often substantial inconsistencies in flux estimates. Addressing these concerns, this paper presents the systematic development of portable, reproducible, and extensible EC software achieved by adopting a development and systems operation (DevOps) approach. This software development model is used for the creation of the eddy4R family of EC code packages in the open-source R language for statistical computing. These packages are community developed, iterated via the Git distributed version control system, and wrapped into a portable and reproducible Docker filesystem that is independent of the underlying host operating system. The HDF5 hierarchical data format then provides a streamlined mechanism for highly compressed and fully self-documented data ingest and output. The usefulness of the DevOps approach was evaluated for three test applications. First, the resultant EC processing software was used to analyze standard flux tower data from the first EC instruments installed at a National Ecological Observatory Network (NEON) field site. Second, through an aircraft test application, we demonstrate the modular extensibility of eddy4R to analyze EC data from other platforms. Third, an intercomparison with commercial-grade software showed excellent agreement (R² = 1.0 for CO2 flux). In conjunction with this study, a Docker image containing the first two eddy4R packages and an executable example workflow, as well as the first NEON EC data products, are released publicly. We conclude by describing the work remaining to arrive at the automated generation of science-grade EC fluxes and benefits to the science community at large.
This software development model is applicable beyond EC and more generally builds the capacity to deploy complex algorithms developed by scientists in an efficient and scalable manner. In addition, modularity permits meeting project milestones while retaining extensibility with time.

Highlights

  • Answering grand challenges in Earth system science and ecology requires combining information from hierarchies of environmental observations

  • The “turbulence”, “storage” and “combined” Docker containers (Fig. 5, right panels) are all spawned from the same eddy4R–Docker image (Fig. 5, center panel): each container includes the same underlying functionality, but serves a different purpose by being fed the “turbulence”, “storage” or “combined” workflow files. This eddy4R–Docker EC processing framework integrates modularly into preexisting data-processing pipelines, such as the CI of the National Ecological Observatory Network (NEON) (Fig. 6): in NEON’s preexisting framework, the CI group encoded simple algorithms in Java, based on algorithm documentation provided by NEON Science staff

  • We present three test applications of eddy4R–Docker to evaluate whether the NEON development and systems operation (DevOps) model can produce collaborative, portable, reproducible, and extensible EC software

Summary

Introduction

Answering grand challenges in Earth system science and ecology requires combining information from hierarchies of environmental observations (tower, aircraft, satellite; Raupach et al., 2005; Running et al., 1999; Turner et al., 2004). It should be noted that the “turbulence”, “storage” and “combined” Docker containers (Fig. 5, right panels) are all spawned from the same eddy4R–Docker image (Fig. 5, center panel): each container includes the same underlying functionality (eddy4R packages), but serves a different purpose by being fed the “turbulence”, “storage” or “combined” workflow files. This eddy4R–Docker EC processing framework modularly integrates into preexisting data-processing pipelines, such as NEON’s CI (Fig. 6): in NEON’s preexisting framework, the CI group encoded simple algorithms (e.g., temporal means) in Java, based on algorithm documentation provided by NEON Science staff. The key difference of the eddy4R–Docker EC processing framework is that instead of algorithm documentation, NEON Science staff provide documented algorithms that perform a complex series of processing steps, which can be directly deployed by CI. Not only does this adoption of the NEON–DevOps workflow (Fig. 2) streamline end-to-end operational implementation and efficiency, it also empowers the science community at large. End-user experience is monitored via the issues feature in GitHub, where users can report code bugs, deployment problems, etc.

www.geosci-model-dev.net/10/3189/2017/
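The one-image, several-containers pattern described above can be illustrated with a minimal Python sketch. It only assembles `docker run` command lines and does not start Docker. The image name, mount paths, and workflow file names are hypothetical placeholders chosen for illustration; they are not taken from the eddy4R release, and the actual NEON deployment may differ.

```python
import shlex

# Hypothetical image name; the source does not specify one.
IMAGE = "example/eddy4r:latest"
# The three workflow names mentioned in the text.
WORKFLOWS = ("turbulence", "storage", "combined")


def docker_run_args(workflow, image=IMAGE, host_dir="/data/workflows"):
    """Build a `docker run` argument list for one workflow container.

    Every container is created from the same image; only the workflow
    file handed to Rscript differs.
    """
    if workflow not in WORKFLOWS:
        raise ValueError(f"unknown workflow: {workflow!r}")
    # Hypothetical workflow file name inside the container.
    script = f"/workflows/flow.{workflow}.R"
    return [
        "docker", "run", "--rm",
        "--name", f"eddy4r-{workflow}",
        "-v", f"{host_dir}:/workflows:ro",  # mount workflow files read-only
        image,
        "Rscript", script,
    ]


# One command per workflow, all sharing the same image.
commands = {name: docker_run_args(name) for name in WORKFLOWS}

if __name__ == "__main__":
    for args in commands.values():
        print(shlex.join(args))
```

Keeping the image fixed and varying only the mounted workflow file mirrors the idea that each container carries identical functionality and is specialized purely by its input.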

Installation and operation
Test applications
Tower eddy-covariance measurements
Algorithm settings and profiling
Results and discussion
Aircraft eddy-covariance measurements
Algorithm settings
Summary and conclusions