Data assimilation plays an essential role in real-time forecasting but demands repeated model evaluations for every ensemble member. To address this computational challenge, a novel, robust and efficient approach to surrogate data assimilation is presented. It replaces the internal processes of the ensemble Kalman filter (EnKF) with polynomial chaos expansion (PCE) theory. Eight types of surrogate filters, characterized by their surrogate structures, building systems, and assimilating targets, are proposed and validated. To compensate for the potential shortcomings of the existing sequential experimental design (SED), an advanced optimization scheme, named sequential experimental design-polynomial degree (SED-PD), is also proposed. Its dual optimization system resolves the limitation of SED whereby the polynomial degree had to be selected ad hoc or by trial and error; its multiple stopping criteria ensure convergence even when an accuracy metric does not decrease monotonically over iterations. A comprehensive investigation into how to configure a surrogate filter indicates that the new partial (replacing part of the original filter) and invariant (valid for the entire time period) approaches are preferred in terms of accuracy and efficiency; they help to reduce the number of dimensions directly and to bridge the gap between hindcasting and real-time forecasting. Of the eight filters, the Dual Invariant Partial filter performs best, with accuracy equivalent to that of the Dual EnKF and about 500 times greater computational efficiency. The proposed surrogate filter is therefore a promising alternative tool for computationally intensive data assimilation in high-dimensional problems.
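The general idea of substituting a PCE surrogate for model evaluations inside an EnKF update can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and not the paper's implementation: the scalar `toy_model`, the standard-normal parameter prior, the Hermite basis fitted by least squares, and the hold-out search in `select_degree`, which is only a simple stand-in for degree selection and is not SED-PD.

```python
import numpy as np
from numpy.polynomial.hermite_e import hermevander


def toy_model(theta):
    """Hypothetical stand-in for an expensive simulator (not from the paper)."""
    return np.exp(0.5 * theta)


def fit_pce(theta, y, degree):
    """Least-squares PCE coefficients in probabilists' Hermite polynomials."""
    basis = hermevander(theta, degree)  # shape (n_samples, degree + 1)
    coeffs, *_ = np.linalg.lstsq(basis, y, rcond=None)
    return coeffs


def eval_pce(coeffs, theta):
    """Evaluate the surrogate at the given parameter values."""
    return hermevander(theta, len(coeffs) - 1) @ coeffs


def select_degree(theta_train, y_train, theta_val, y_val, max_degree=8):
    """Hold-out grid search over the degree (illustrative only, not SED-PD)."""
    best = None
    for degree in range(1, max_degree + 1):
        coeffs = fit_pce(theta_train, y_train, degree)
        rmse = np.sqrt(np.mean((eval_pce(coeffs, theta_val) - y_val) ** 2))
        if best is None or rmse < best[1]:
            best = (degree, rmse, coeffs)
    return best


rng = np.random.default_rng(0)

# Build the surrogate once from a small number of "expensive" model runs.
theta_train = rng.standard_normal(60)
theta_val = rng.standard_normal(200)
degree, rmse, coeffs = select_degree(
    theta_train, toy_model(theta_train), theta_val, toy_model(theta_val)
)

# Stochastic EnKF parameter update: the surrogate replaces the model
# when mapping each ensemble member to its predicted observation.
true_theta, obs_std = 0.8, 0.05
ensemble = rng.standard_normal(500)            # prior parameter ensemble
predicted = eval_pce(coeffs, ensemble)         # cheap surrogate evaluations
obs = toy_model(true_theta)                    # synthetic observation
perturbed_obs = obs + rng.normal(0.0, obs_std, size=ensemble.size)

cov_theta_y = np.cov(ensemble, predicted)[0, 1]
gain = cov_theta_y / (np.var(predicted, ddof=1) + obs_std**2)
posterior = ensemble + gain * (perturbed_obs - predicted)

print(f"selected degree: {degree}, validation RMSE: {rmse:.2e}")
print(f"prior mean: {ensemble.mean():.3f}, posterior mean: {posterior.mean():.3f}")
```

In this sketch the surrogate is built once and reused for all 500 ensemble members, which is where the savings would come from when the real model is expensive; the linear Kalman gain moves the ensemble toward the value consistent with the observation but does not recover it exactly for a nonlinear model.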