Abstract
Can we improve machine-learning (ML) emulators with synthetic data? If data are scarce or expensive to source and a physical model is available, statistically generated data may be useful for augmenting training sets cheaply. Here we explore the use of copula-based models for generating synthetically augmented datasets in weather and climate by testing the method on a toy physical model of downwelling longwave radiation and a corresponding neural network emulator. Results show that for copula-augmented datasets, predictions are improved by up to 62 % for the mean absolute error (from 1.17 to 0.44 W m−2).
Highlights
The use of machine learning (ML) in weather and climate is becoming increasingly popular (Huntingford et al., 2019; Reichstein et al., 2019).
When it comes to training ML models for weather and climate applications, two main strategies may be identified: one in which input and output pairs are directly provided and a second in which inputs are provided but corresponding outputs are generated through a physical model.
The method is demonstrated with a toy model of downwelling radiation as the physical model (Sect. 2.4) and a simple feed-forward neural network (FNN) as the ML emulator (Sect. 2.5).
Summary
The use of machine learning (ML) in weather and climate is becoming increasingly popular (Huntingford et al., 2019; Reichstein et al., 2019; see also Krasnopolsky and Lin, 2012; Rasp and Lerch, 2018). When it comes to training ML models for weather and climate applications, two main strategies may be identified: one in which input and output pairs are directly provided (e.g. both come from observations) and a second in which inputs are provided but corresponding outputs are generated through a physical model (e.g. parameterization schemes or even a whole weather and climate model).
Given estimated models C and F1, ..., Fn for the copula and marginal distributions, synthetic data can be generated by first sampling a vector (U1, ..., Un) from C and then transforming each component through the inverse of its marginal, Xj = Fj^-1(Uj).
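The generation step described above, sampling from a fitted copula and mapping each component back through its marginal, can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the method used in the paper: it swaps in a Gaussian copula for C and empirical quantiles for the inverse marginals, and the function names (`fit_gaussian_copula`, `sample_synthetic`) and the two-variable training set are hypothetical.

```python
import numpy as np
from scipy import stats


def fit_gaussian_copula(data):
    """Estimate a Gaussian copula correlation matrix (assumed copula family)."""
    n_samples = data.shape[0]
    # Rank-transform each column to (0, 1) to obtain pseudo-observations.
    u = stats.rankdata(data, axis=0) / (n_samples + 1)
    # Map to standard normal scores and estimate their correlation.
    z = stats.norm.ppf(u)
    return np.corrcoef(z, rowvar=False)


def sample_synthetic(data, corr, n_new, rng):
    """Draw n_new synthetic rows: U ~ C, then X_j = F_j^{-1}(U_j)."""
    n_features = data.shape[1]
    # Step 1: sample U from the Gaussian copula C.
    z = rng.multivariate_normal(np.zeros(n_features), corr, size=n_new)
    u = stats.norm.cdf(z)
    # Step 2: apply the inverse empirical marginal of each column.
    return np.column_stack(
        [np.quantile(data[:, j], u[:, j]) for j in range(n_features)]
    )


rng = np.random.default_rng(0)
# Hypothetical training inputs: two dependent variables, 500 samples.
x = rng.normal(size=500)
train = np.column_stack([x, np.exp(0.5 * x + 0.1 * rng.normal(size=500))])

corr = fit_gaussian_copula(train)
synthetic = sample_synthetic(train, corr, n_new=1000, rng=rng)
# Augmented input set: original rows followed by synthetic rows.
augmented = np.vstack([train, synthetic])
```

In the second training strategy described above, the synthetic inputs would then be passed through the physical model to obtain the corresponding outputs before training the emulator. Empirical quantiles keep synthetic values within the observed range of each variable; a parametric marginal would be needed to extrapolate beyond it.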