Abstract

In this paper, we consider a surrogate modeling approach using a data-driven nonparametric likelihood function constructed on a manifold on which the data lie (or to which they are close). The proposed method represents the likelihood function using a spectral expansion formulation known as the kernel embedding of the conditional distribution. To respect the geometry of the data, we employ this spectral expansion using a set of data-driven basis functions obtained from the diffusion maps algorithm. The theoretical error estimate suggests that the error bound of the approximate data-driven likelihood function is independent of the variance of the basis functions, which allows us to determine the amount of training data needed for accurate likelihood function estimation. Numerical results demonstrating the robustness of the data-driven likelihood functions for parameter estimation are given on instructive examples involving stochastic and deterministic differential equations. When the dimension of the data manifold is strictly less than the dimension of the ambient space, we find that the proposed approach (which does not require knowledge of the data manifold) is superior to likelihood functions constructed using standard parametric basis functions defined on the ambient coordinates. In an example where the data manifold is not smooth and unknown, the proposed method is more robust than the non-intrusive spectral projection, an existing polynomial chaos surrogate model that assumes a parametric likelihood. In fact, whenever direct MCMC can be performed, the estimation accuracy is comparable to direct MCMC estimates with only eight likelihood function evaluations, which can be done offline, as opposed to 4000 sequential function evaluations. Robust, accurate estimation is also found using a likelihood function trained on statistical averages of the chaotic 40-dimensional Lorenz-96 model over a wide parameter domain.
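To make the expansion concrete, here is a minimal, self-contained Python sketch of a spectral conditional density estimate in the same spirit: coefficients of a tensor-product basis expansion are fit as Monte Carlo averages over training pairs (θᵢ, yᵢ). It substitutes analytic cosine bases on [0, 1] and a uniform prior for simplicity, whereas the paper builds the basis in y from diffusion maps; the function names and the toy model y = θ² + noise are purely illustrative, not from the paper's code.

```python
# Minimal sketch of a spectral conditional density estimate
#   p(y | theta) ~ sum_{j,k} c_jk * phi_j(y) * psi_k(theta),
# with coefficients c_jk estimated as Monte Carlo averages.
import numpy as np

def cosine_basis(x, n_modes):
    """Orthonormal cosine basis on [0, 1] w.r.t. the uniform measure."""
    cols = [np.ones_like(x)]
    for j in range(1, n_modes):
        cols.append(np.sqrt(2.0) * np.cos(np.pi * j * x))
    return np.stack(cols, axis=-1)            # shape (len(x), n_modes)

def fit_coefficients(theta_train, y_train, n_modes=8):
    """Monte Carlo estimate of c_jk = E[ phi_j(Y) psi_k(Theta) ]."""
    Phi = cosine_basis(y_train, n_modes)      # (N, J)
    Psi = cosine_basis(theta_train, n_modes)  # (N, K)
    return Phi.T @ Psi / len(y_train)         # (J, K)

def likelihood(y_obs, theta_grid, C, n_modes=8):
    """Evaluate the surrogate p(y_obs | theta) on a grid of theta values."""
    phi = cosine_basis(np.atleast_1d(y_obs), n_modes)  # (1, J)
    psi = cosine_basis(theta_grid, n_modes)            # (M, K)
    p = (phi @ C @ psi.T).ravel()                      # (M,)
    return np.clip(p, 1e-12, None)   # clip so the density stays positive

# Toy training data: y = theta^2 + noise, both squashed into [0, 1].
# Because theta is sampled uniformly, p(theta) = 1 and the joint-density
# expansion doubles as the conditional density p(y | theta).
rng = np.random.default_rng(0)
theta_train = rng.uniform(0, 1, 20000)
y_train = np.clip(theta_train**2 + 0.1 * rng.standard_normal(20000), 0, 1)

C = fit_coefficients(theta_train, y_train)
theta_grid = np.linspace(0, 1, 200)
like = likelihood(0.25, theta_grid, C)   # roughly peaks near theta = 0.5
```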

Highlights

  • Bayesian inference is a popular approach for solving inverse problems with far-reaching applications, such as parameter estimation and uncertainty quantification. In this article, we focus on a classical Bayesian inference problem: estimating the conditional distribution of hidden parameters of dynamical systems from a given set of noisy observations. In particular, let x(t; θ) be a time-dependent state variable that depends implicitly on the parameter θ through the following initial value problem: $\dot{x} = f(x, \theta), \; x(0) = x_0$. (1) Here, for any fixed θ, f can be either deterministic or stochastic.

  • We have developed a parameter estimation framework in which Markov Chain Monte Carlo (MCMC) is employed with a nonparametric likelihood function (see the sketch after this list).

  • Our approach approximates the likelihood function using the kernel embedding of conditional distributions, a formulation based on a Reproducing Kernel Weighted Hilbert Space (RKWHS).
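
To illustrate how a surrogate likelihood of this kind plugs into the sampler, below is a minimal random-walk Metropolis sketch under a uniform prior on [0, 1]. The metropolis function, its defaults, and the commented usage line (which reuses the hypothetical likelihood surrogate from the earlier sketch) are illustrative assumptions, not the paper's implementation.

```python
# Minimal random-walk Metropolis sketch, assuming a precomputed surrogate
# log-likelihood. Because the surrogate is trained offline, each MCMC step
# evaluates a cheap basis expansion instead of re-integrating the dynamical
# model (1).
import numpy as np

def metropolis(log_like, theta0, n_steps=5000, step=0.05, rng=None):
    rng = rng or np.random.default_rng()
    chain = np.empty(n_steps)
    theta, ll = theta0, log_like(theta0)
    for i in range(n_steps):
        prop = theta + step * rng.standard_normal()
        if 0.0 <= prop <= 1.0:                  # uniform prior on [0, 1]
            ll_prop = log_like(prop)
            if np.log(rng.uniform()) < ll_prop - ll:
                theta, ll = prop, ll_prop
        chain[i] = theta
    return chain

# Hypothetical usage with the surrogate from the earlier sketch:
# chain = metropolis(lambda t: np.log(likelihood(0.25, np.array([t]), C)[0]),
#                    theta0=0.5)
```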


Summary

Introduction

Bayesian inference is a popular approach for solving inverse problems with far-reaching applications, such as parameter estimation and uncertainty quantification (see, for example, [1,2,3]). Our approach respects the geometry of the data manifold. Using this nonparametric likelihood function, we generate the MCMC chain for estimating the conditional distribution of hidden parameters. In one of the examples, where the dynamical model is low-dimensional and the observation is of the form (5), we compare the proposed approach with direct MCMC and the non-intrusive spectral projection method (both schemes use a likelihood of the form (6)). We demonstrate the robustness of the proposed approach on an example where g is a statistical average over a long-time trajectory (in which case the likelihood is intractable) and the dynamical model has relatively high-dimensional chaotic dynamics, so that repeated evaluation of (4) is numerically expensive. We accompany this paper with appendices on the treatment of large amounts of data and additional numerical results.
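
Since the geometry-respecting basis comes from the diffusion maps algorithm (see the Data-Driven Basis Functions sections in the outline below), the following sketch shows one standard variant with a Gaussian kernel and α = 1 density normalization; the bandwidth rule and normalization conventions here are illustrative choices and need not match the paper's exact algorithm.

```python
# Hedged sketch of diffusion maps for data-driven basis functions on a
# point cloud near a manifold; conventions vary, so treat this as an
# illustration rather than the paper's precise construction.
import numpy as np

def diffusion_maps_basis(Y, n_basis=10, eps=None):
    """Y: (N, d) samples near a manifold -> (N, n_basis) basis vectors."""
    D2 = ((Y[:, None, :] - Y[None, :, :]) ** 2).sum(-1)  # pairwise sq. dists
    if eps is None:
        eps = np.median(D2)             # crude, hypothetical bandwidth rule
    K = np.exp(-D2 / eps)
    q = K.sum(axis=1)
    K = K / np.outer(q, q)              # alpha = 1: remove density bias
    d = K.sum(axis=1)
    # The Markov matrix P = diag(d)^{-1} K is conjugate to the symmetric
    # matrix S below, so a symmetric eigensolver can be used.
    S = K / np.sqrt(np.outer(d, d))
    vals, vecs = np.linalg.eigh(S)
    idx = np.argsort(vals)[::-1][:n_basis]
    phi = vecs[:, idx] / np.sqrt(d)[:, None]   # eigenvectors of P
    return phi   # orthogonal in the d-weighted (density-weighted) inner product
```

For instance, on a point cloud sampled from a circle embedded in a higher-dimensional ambient space, the leading columns of phi approximate Fourier-like modes along the circle rather than functions of the ambient coordinates, which is the geometry-adapted behavior described above.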

Conditional Density Estimation via Reproducing Kernel Weighted Hilbert Spaces
Review of Nonparametric RKWHS Representation of Conditional Density Functions
Error Estimation
Error Estimation Using Arbitrary Bases
Error Estimation Using a Data-Driven Hilbert Space
Basis Functions
Analytic Basis Functions
Data-Driven Basis Functions
Learning the Data-Driven Basis Functions
Nyström Extension
Parameter Estimation Using the Metropolis Scheme
Example I
Example II
Example III
Example IV
Conclusions