Abstract

Large computer models are ubiquitous in the Earth sciences. These models often have tens or hundreds of tuneable parameters and can take thousands of core hours to run to completion while generating terabytes of output. It is becoming common practice to develop emulators as fast approximations, or surrogates, of these models in order to explore the relationships between their inputs and outputs, understand uncertainties, and generate large ensemble datasets. While the purpose of these surrogates may differ, their development is often very similar. Here we introduce ESEm: an open-source tool providing a general workflow for emulating and validating a wide variety of models and outputs. It includes efficient routines for sampling these emulators for the purpose of uncertainty quantification and model calibration. It is built on well-established, high-performance libraries to ensure robustness, extensibility and scalability. We demonstrate the flexibility of ESEm through three case studies: reducing parametric uncertainty in a general circulation model, exploring precipitation sensitivity in a cloud-resolving model, and exploring scenario uncertainty in the CMIP6 multi-model ensemble.

Highlights

  • Computer models are crucial tools for their diagnostic and predictive power and are applied to every aspect of the Earth sciences

  • GPFlow builds on the heritage of the GPy library (GPy, 2012) but is based on the TensorFlow (Abadi et al., 2016) machine learning library, with out-of-the-box support for graphics processing units (GPUs), which can considerably speed up the training of Gaussian processes (GPs)

  • While approximate Bayesian computation (ABC) and Markov chain Monte Carlo (MCMC) form the backbone of many parameter estimation techniques, there has been a large amount of research on improved techniques for complex simulators with high-dimensional outputs


Summary

Introduction

Computer models are crucial tools for their diagnostic and predictive power and are applied to every aspect of the Earth sciences. Despite initial difficulties with the scalability of Gaussian processes (GPs) compared with, e.g., neural networks, recent advances have allowed for deeper, more expressive GPs (Damianou and Lawrence, 2013) that can be trained on ever larger volumes of training data (Burt et al., 2019). Despite their prevalent use in other areas of machine learning, convolutional neural networks (CNNs) and random forests (RFs) have not been widely used in model emulation. ESEm provides a number of options for performing the inference needed for model calibration, from simple rejection sampling to more complex Markov chain Monte Carlo (MCMC) techniques. Despite the increasing popularity of emulators, no general-purpose toolset exists for model emulation in the Earth sciences. In this paper we aim to describe the ESEm tool and to elucidate the general process of emulation with a number of distinct examples, including model calibration, in the hope of demonstrating its usefulness to the field.
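To make the emulate-then-calibrate idea concrete, the following is a minimal sketch in plain NumPy. It does not use the ESEm API. The one-parameter `toy_model`, the fixed kernel settings, the uniform prior on [0, 1], the acceptance threshold and all function names are assumptions chosen for illustration. A small zero-mean Gaussian process is fitted to eight runs of the toy model, and simple rejection sampling then keeps the prior draws whose emulated output lies within a tolerance of a synthetic observation.

```python
import numpy as np

PRIOR_VARIANCE = 1.0  # assumed GP prior variance (illustrative, not tuned)
LENGTHSCALE = 0.3     # assumed kernel lengthscale (illustrative, not tuned)


def toy_model(theta):
    """Hypothetical stand-in for an expensive simulator (one input, one output)."""
    return theta**2 + theta


def rbf_kernel(a, b):
    """Squared-exponential covariance between two 1-D arrays of inputs."""
    sq_dist = (a[:, None] - b[None, :]) ** 2
    return PRIOR_VARIANCE * np.exp(-0.5 * sq_dist / LENGTHSCALE**2)


def fit_gp(x_train, y_train, jitter=1e-6):
    """Factorise the training covariance of a zero-mean GP."""
    k = rbf_kernel(x_train, x_train) + jitter * np.eye(len(x_train))
    chol = np.linalg.cholesky(k)
    alpha = np.linalg.solve(chol.T, np.linalg.solve(chol, y_train))
    return chol, alpha


def predict_gp(x_new, x_train, chol, alpha):
    """Return the GP posterior mean and variance at each new input."""
    k_star = rbf_kernel(x_new, x_train)
    mean = k_star @ alpha
    v = np.linalg.solve(chol, k_star.T)
    var = PRIOR_VARIANCE - np.sum(v**2, axis=0)
    return mean, np.maximum(var, 0.0)  # clip tiny negative round-off


def rejection_sample(obs, obs_var, x_train, chol, alpha,
                     n_samples=20000, threshold=3.0, seed=0):
    """Keep prior draws whose emulated output is close to the observation."""
    rng = np.random.default_rng(seed)
    candidates = rng.uniform(0.0, 1.0, n_samples)  # uniform prior on [0, 1]
    mean, var = predict_gp(candidates, x_train, chol, alpha)
    # Distance scaled by the combined emulator and observational uncertainty
    distance = np.abs(mean - obs) / np.sqrt(var + obs_var)
    return candidates[distance < threshold]


# Train the emulator on a handful of "expensive" model runs
x_train = np.linspace(0.0, 1.0, 8)
y_train = toy_model(x_train)
chol, alpha = fit_gp(x_train, y_train)

# Calibrate against a synthetic observation generated with a known parameter
true_theta = 0.6
observation = toy_model(np.array([true_theta]))[0]
posterior = rejection_sample(observation, 0.05**2, x_train, chol, alpha)
print(f"accepted {posterior.size} of 20000 draws; mean = {posterior.mean():.3f}")
```

The emulator is evaluated on all 20,000 candidate parameters in a single vectorised call, which is what makes this kind of sampling affordable when each run of the real model would cost thousands of core hours. In practice the kernel hyperparameters would be learned from the training data rather than fixed by hand.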

Exemplar problem
Emulation engines
Input data preparation
Gaussian process engine
Neural network engine
Random forests
Calibration
Approximate Bayesian computation
Extensions
Cloud-resolving model sensitivity
Exploring CMIP6 scenario uncertainty
Findings
Conclusions
