A non-Gaussian analysis scheme using rank histograms for ensemble data assimilation

S. Metref, C. Snyder, E. Cosme, P. Brasseur

doi:10.5194/npg-21-869-2014

Abstract

One challenge of geophysical data assimilation is to address the issue of non-Gaussianities in the distributions of the physical variables ensuing, in many cases, from nonlinear dynamical models. Non-Gaussian ensemble analysis methods fall into two categories, those remapping the ensemble particles by approximating the best linear unbiased estimate, for example, the ensemble Kalman filter (EnKF), and those resampling the particles by directly applying Bayes' rule, like particle filters. In this article, it is suggested that the most common remapping methods can only handle weakly non-Gaussian distributions, while the others suffer from sampling issues. In between those two categories, a new remapping method directly applying Bayes' rule, the multivariate rank histogram filter (MRHF), is introduced as an extension of the rank histogram filter (RHF) first introduced by Anderson (2010). Its performance is evaluated and compared with several data assimilation methods, on different levels of non-Gaussianity with the Lorenz 63 model. The method's behavior is then illustrated on a simple density estimation problem using ensemble simulations from a coupled physical–biogeochemical model of the North Atlantic ocean. The MRHF performs well with low-dimensional systems in strongly non-Gaussian regimes.

Highlights

The principal goal of data assimilation is to estimate the state of a dynamical system, based on prior information and a time series of observations, while calculating probabilistic measures corresponding to the accuracy of this estimation.
In a medium nonlinear case ( t = 0.25, central panel of Fig. 8) and for Ne ≥ 32, both the multivariate rank histogram filter (MRHF) in its full formulation and the mean-field approximated MRHF produce a smaller root mean square error (RMSE) than the ensemble Kalman filter (EnKF) and the rank histogram filter (RHF).
The KL divergence of the EnKF and RHF does not depend on the ensemble size, confirming that these methods are not designed to deal with such multimodal problems.

Summary

Introduction

The principal goal of data assimilation is to estimate the state of a dynamical system, based on prior information and a time series of observations, while calculating probabilistic measures corresponding to the accuracy of this estimation. Methods based on the ensemble Kalman filter (EnKF) transform the prior ensemble into a posterior ensemble, using a function that is optimal (an optimal map, Cotter and Reich, 2013) under the assumption of Gaussianity of the prior ensemble and the observation errors. These methods are applicable to high-dimensional systems in meteorology (Whitaker et al., 2008; Buehner et al., 2010) and oceanography (Lermusiaux, 2006; Sakov et al., 2012). However, EnKF-based methods remain sensitive to violations of the Gaussian assumption (Lawson and Hansen, 2004; Lei et al., 2010), which may lead to unwanted phenomena such as inaccurate estimates, failure to respect nonlinear physical balances, or, more dramatically, instability of the filter.
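To make the idea of remapping a prior ensemble into a posterior ensemble concrete, the following is a minimal sketch of one EnKF analysis step in Python with NumPy. It assumes the stochastic (perturbed-observation) variant and a linear observation operator; the article does not specify which EnKF variant it uses, so this choice is an assumption. The function name `enkf_analysis`, the three-variable state, and all numerical values are illustrative, not taken from the article.

```python
import numpy as np


def enkf_analysis(prior, y_obs, H, R, rng):
    """One stochastic (perturbed-observation) EnKF analysis step.

    prior : (n_state, n_ens) prior ensemble, one member per column
    y_obs : (n_obs,) observation vector
    H     : (n_obs, n_state) linear observation operator (assumed linear)
    R     : (n_obs, n_obs) observation-error covariance
    rng   : numpy.random.Generator used to perturb the observations
    """
    n_ens = prior.shape[1]
    # Sample covariance of the prior ensemble.
    anomalies = prior - prior.mean(axis=1, keepdims=True)
    P = anomalies @ anomalies.T / (n_ens - 1)
    # Kalman gain K = P H^T (H P H^T + R)^{-1}, computed with a linear solve.
    S = H @ P @ H.T + R
    K = np.linalg.solve(S, H @ P).T
    # Each member assimilates its own noisy copy of the observation.
    noise = rng.multivariate_normal(np.zeros(len(y_obs)), R, size=n_ens).T
    perturbed_obs = y_obs[:, None] + noise
    # Linear remapping of every prior member to a posterior member.
    return prior + K @ (perturbed_obs - H @ prior)


# Illustrative setup: 3 state variables, 200 members, first variable observed.
rng = np.random.default_rng(0)
prior = rng.normal(loc=[[1.0], [2.0], [3.0]], scale=1.0, size=(3, 200))
H = np.array([[1.0, 0.0, 0.0]])
R = np.array([[0.1]])
y_obs = np.array([2.0])

posterior = enkf_analysis(prior, y_obs, H, R, rng)
print("prior mean of observed variable:    ", prior[0].mean())
print("posterior mean of observed variable:", posterior[0].mean())
```

Because the update is a linear function of each prior member, built only from the ensemble mean and covariance, it is exact only when the prior and the observation errors are Gaussian. That is the limitation the paragraph above describes for strongly non-Gaussian priors.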

Objectives

Methods

Results

Conclusion