Abstract

Spatio-temporal fields of land–atmosphere fluxes derived from data-driven models can complement simulations by process-based land surface models. While a number of strategies for building empirical models from eddy-covariance flux data have been applied, a systematic intercomparison of these methods has been missing so far. In this study, we performed a cross-validation experiment for predicting carbon dioxide, latent heat, sensible heat and net radiation fluxes across different ecosystem types with 11 machine learning (ML) methods from four different classes (kernel methods, neural networks, tree methods, and regression splines). We applied two complementary setups: (1) 8-day average fluxes based on remotely sensed data and (2) daily mean fluxes based on meteorological data and a mean seasonal cycle of remotely sensed variables. The patterns of predictions from different ML methods and experimental setups were highly consistent. There were systematic differences in performance among the fluxes, with the following ascending order: net ecosystem exchange (R² < 0.5), ecosystem respiration (R² > 0.6), gross primary production (R² > 0.7), latent heat (R² > 0.7), sensible heat (R² > 0.7), and net radiation (R² > 0.8). The ML methods predicted the across-site variability and the mean seasonal cycle of the observed fluxes very well (R² > 0.7), while the 8-day deviations from the mean seasonal cycle were not well predicted (R² < 0.5). Fluxes were better predicted at forested and temperate climate sites than at sites in extreme climates or in regions less represented by training data (e.g., the tropics). The evaluated large ensemble of ML-based models will be the basis of new global flux products.
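The abstract reports R² separately for across-site variability, the mean seasonal cycle, and the deviations from that cycle. The following is a minimal sketch of how such a decomposition could be computed, not the authors' implementation. It runs on synthetic data only; the function names `decompose` and `r2`, the array layout (sites, years, steps), the choice of 46 eight-day steps per year, and the use of the squared Pearson correlation as R² are all assumptions made for illustration.

```python
import numpy as np


def r2(obs, pred):
    """Squared Pearson correlation (an assumed definition of R2)."""
    return float(np.corrcoef(np.ravel(obs), np.ravel(pred))[0, 1] ** 2)


def decompose(flux):
    """Split a (sites, years, steps) array into three additive parts.

    Returns (site_mean, seasonal_cycle, anomaly) such that
    flux == site_mean[:, None, None] + seasonal_cycle[:, None, :] + anomaly.
    """
    per_step_mean = flux.mean(axis=1)                 # (sites, steps)
    site_mean = flux.mean(axis=(1, 2))                # (sites,)
    seasonal_cycle = per_step_mean - site_mean[:, None]
    anomaly = flux - per_step_mean[:, None, :]        # deviation from the cycle
    return site_mean, seasonal_cycle, anomaly


# Synthetic "observed" and "predicted" fluxes (hypothetical values).
rng = np.random.default_rng(0)
n_sites, n_years, n_steps = 20, 4, 46                 # assumed dimensions
site_level = rng.normal(0.0, 3.0, n_sites)[:, None, None]
season = 5.0 * np.sin(2 * np.pi * np.arange(n_steps) / n_steps)[None, None, :]
shape = (n_sites, n_years, n_steps)
obs = site_level + season + rng.normal(0.0, 1.0, shape)
# The toy prediction shares site level and seasonality but not the anomalies.
pred = site_level + season + rng.normal(0.0, 1.0, shape)

names = ("across_site", "seasonal_cycle", "anomaly")
scores = {
    name: r2(o, p)
    for name, o, p in zip(names, decompose(obs), decompose(pred))
}
print(scores)
```

In this toy setup the prediction reproduces site means and seasonality but carries independent noise, so the anomaly score is low by construction. This mirrors the qualitative pattern described in the abstract without reproducing any of its numbers.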

Highlights

  • Improving our knowledge of the carbon, water, and energy exchanges between terrestrial ecosystems and the atmosphere is essential to better understand and model the Earth’s climate system (IPCC, 2007; Reich, 2010).

  • The increasing number of eddy-covariance sites across the globe has encouraged the application of data-driven models by machine learning (ML) methods such as artificial neural networks (ANNs; Papale and Valentini, 2003), random forest (RF; Tramontana et al., 2015), model trees ensemble (MTE; Jung et al., 2009; Xiao et al., 2008, 2010) or support vector regression (SVR; Yang et al., 2006, 2007) to estimate land surface–atmosphere fluxes from site level to regional or global scales (e.g., Beer et al., 2010; Jung et al., 2010, 2011; Kondo et al., 2015; Schwalm et al., 2010, 2012; Yang et al., 2007; Xiao et al., 2008, 2010).

  • We evaluated the overall predictive capacity and consistency of ML approaches – including the ML median estimate – by flux, by experimental setup and by site, as well as grouped by Köppen climate zone and International Geosphere-Biosphere Programme (IGBP) plant functional type.
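The "ML median estimate" mentioned above can be illustrated with a small sketch. It assumes the estimate is the per-sample median across the predictions of the individual methods; the text shown here does not spell out the exact procedure, and the function name `ensemble_median` and the numbers are hypothetical.

```python
import numpy as np


def ensemble_median(predictions):
    """Median across methods for each sample.

    predictions: array-like of shape (n_methods, n_samples).
    """
    return np.median(np.asarray(predictions, dtype=float), axis=0)


# Three hypothetical methods predicting the same three samples;
# the third method produces an outlier for the first sample.
method_predictions = np.array([
    [1.0, 2.0, 3.0],
    [1.2, 2.1, 2.9],
    [9.0, 2.2, 3.1],
])
median_estimate = ensemble_median(method_predictions)
print(median_estimate)
```

A median is less sensitive than a mean to a single method producing an extreme value, which is one plausible reason to report it alongside the individual methods.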


Summary

Introduction

Improving our knowledge of the carbon, water, and energy exchanges between terrestrial ecosystems and the atmosphere is essential to better understand and model the Earth’s climate system (IPCC, 2007; Reich, 2010). The large-scale measurement network, FLUXNET, integrates site observations of these fluxes globally and provides detailed time series of carbon and energy fluxes across biomes and climates (Baldocchi, 2008). Eddy-covariance measurements are site-level observations (at < 1 km scale), and spatial upscaling is required to estimate these fluxes at regional to global scales. The ML upscaled outputs are increasingly used to evaluate process-based land surface models (e.g., Anav et al., 2013; Bonan et al., 2011; Ichii et al., 2009; Piao et al., 2013).
