Abstract
Ensemble Kalman Filters are used extensively in all geoscience areas. Often a stochastic variant is used, in which each ensemble member is updated via the Kalman Filter equation with an extra perturbation in the innovation. These perturbations are essential for the correct ensemble spread in a stochastic Ensemble Kalman Filter, and are applied either to the observations or to the modelled observations. This paper investigates whether there is a preference for either of these two perturbation methods. Both versions lead to the same posterior mean and covariance when the prior and the likelihood are Gaussian in the state. However, ensemble verification methods, Bayes' Theorem and the Best Linear Unbiased Estimate (BLUE) suggest that one should perturb the modelled observations. Furthermore, it is known that in non-Gaussian settings the perturbed modelled observation scheme is preferred, illustrated here for a skewed likelihood. Existing reasons for the perturbed observation scheme are shown to be incorrect, and no new arguments in favour of that scheme have been found. Finally, a new and consistent derivation and interpretation of the stochastic version of the EnKF equations is presented, based on perturbing modelled observations. It is argued that these results have direct consequences for (iterative) Ensemble Kalman Filters and Smoothers, including “perturbed observation” 3D- and 4D-Vars, both in terms of internal consistency and implementation.
Highlights
A major breakthrough in data assimilation and Bayesian inference for high-dimensional systems was the introduction of the Ensemble Kalman Filter (EnKF) by Evensen (1994).
The scheme uses an ensemble representation of the probability density function (pdf) of the state of the system, and the propagation of the pdf in time is represented by the propagation of the ensemble members with the full model equations.
It is expected that perturbing modelled observations instead of perturbing observations is beneficial for non-Gaussian situations where iterative variants of the update scheme are used, such as stochastic iterative ensemble filters and smoothers and the “Ensemble of Data Assimilations” scheme (e.g., Žagar et al, 2005 and Isaksen et al, 2011), as these can be seen as Gauss–Newton iterations of the linear scheme discussed here.
Summary
A major breakthrough in data assimilation and Bayesian inference for high-dimensional systems was the introduction of the Ensemble Kalman Filter (EnKF) by Evensen (1994). Burgers et al (1998) argue that the missing term is due to the fact that each ensemble member is updated with the same observation, leading them to suggest a perturbed observation scheme. Since $\epsilon_i$ and the prior ensemble members are independent draws from independent distributions, this would lead to the correct covariance update in the limit of an infinite ensemble size: $x_i^a - \bar{x}^a = x_i^f + K(y^o + \epsilon_i - H x_i^f) - (I - KH)\bar{x}^f - K y^o$. Repeating the analysis for the perturbed observation scheme given above with the sign of $\epsilon$ changed shows that this scheme leads to the same posterior statistics. This version of the stochastic EnKF has been used more and more (e.g., the review on ensemble methods in Vetra-Carvalho et al, 2018). To keep the paper focussed, we will not discuss issues like inbreeding, localisation and inflation, or efficient updating schemes, as they are not directly relevant for the present discussion.
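The statement that both perturbation schemes give the same posterior mean and covariance in the Gaussian case can be checked numerically. The following is a minimal sketch, not the paper's implementation: it assumes a scalar state, an observation operator H = 1, and a Kalman gain computed from known prior and observation-error variances rather than from the ensemble. The function name `stochastic_enkf_scalar` and all numerical values are hypothetical choices made for illustration.

```python
import random
import statistics


def stochastic_enkf_scalar(prior, y_obs, obs_var, gain, perturb_observation, rng):
    """Stochastic EnKF update for a scalar state with H = 1 (illustrative assumption).

    perturb_observation=True : innovation is (y_obs + eps) - x_f
    perturb_observation=False: innovation is y_obs - (x_f + eps),
                               i.e. the modelled observation is perturbed.
    """
    sign = 1.0 if perturb_observation else -1.0
    obs_std = obs_var ** 0.5
    posterior = []
    for x_f in prior:
        eps = rng.gauss(0.0, obs_std)  # independent draw for each member
        innovation = y_obs + sign * eps - x_f
        posterior.append(x_f + gain * innovation)
    return posterior


# Hypothetical test problem: Gaussian prior and Gaussian likelihood.
rng = random.Random(0)
prior_mean, prior_var = 0.0, 4.0
obs_var, y_obs = 1.0, 2.0
n_members = 50_000

prior = [rng.gauss(prior_mean, prior_var ** 0.5) for _ in range(n_members)]
gain = prior_var / (prior_var + obs_var)  # Kalman gain for H = 1

post_po = stochastic_enkf_scalar(prior, y_obs, obs_var, gain, True, rng)
post_pmo = stochastic_enkf_scalar(prior, y_obs, obs_var, gain, False, rng)

# Exact Kalman Filter posterior for comparison.
exact_mean = prior_mean + gain * (y_obs - prior_mean)
exact_var = (1.0 - gain) * prior_var

for name, ens in [("perturbed obs", post_po), ("perturbed modelled obs", post_pmo)]:
    print(name, statistics.fmean(ens), statistics.variance(ens))
print("exact", exact_mean, exact_var)
```

With a large ensemble, both sample means and variances should lie close to the exact Kalman Filter values, up to sampling noise. The sketch only addresses this Gaussian equivalence; it does not illustrate the non-Gaussian case in which the text argues the perturbed modelled observation scheme is preferred.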