Abstract

Efficiently accessing the information contained in non-linear and high-dimensional probability distributions remains a core challenge in modern statistics. Traditionally, estimators that go beyond point estimates are categorized as either Variational Inference (VI) or Markov chain Monte Carlo (MCMC) techniques. While MCMC methods that utilize the geometric properties of continuous probability distributions to increase their efficiency have been proposed, VI methods rarely exploit this geometry. This work aims to fill this gap and proposes geometric Variational Inference (geoVI), a method based on Riemannian geometry and the Fisher information metric. It is used to construct a coordinate transformation that relates the Riemannian manifold associated with the metric to Euclidean space. The distribution, expressed in the coordinate system induced by the transformation, takes a particularly simple form that allows for an accurate variational approximation by a normal distribution. Furthermore, the algorithmic structure allows for an efficient implementation of geoVI, which is demonstrated on multiple examples, ranging from low-dimensional illustrative ones to non-linear, hierarchical Bayesian inverse problems in thousands of dimensions.
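To make the idea concrete, the following is a rough one-dimensional sketch of the transformation described above, not the authors' implementation: the toy forward model (an exponential with Gaussian noise), the grid, and all numerical choices are assumptions for illustration. In one dimension, an isometry g to Euclidean coordinates satisfies (dg/dxi)^2 = M(xi), so g is the cumulative integral of the square root of the metric; the posterior is then approximated by a standard normal in the transformed coordinates.

```python
# Hypothetical 1-D illustration of the geoVI idea (a sketch, not the paper's
# algorithm): build a coordinate transformation g whose Jacobian squares to
# the metric M, then approximate the transformed distribution by N(0, 1).

import numpy as np
from scipy.integrate import cumulative_trapezoid
from scipy.interpolate import interp1d

# Toy model (assumed for illustration): standard-normal prior on xi and a
# Gaussian likelihood d ~ N(exp(xi), sigma_n^2).
sigma_n = 0.5

def metric(xi):
    # Fisher metric of the likelihood plus the prior curvature (which is the
    # identity for a standard-normal prior): M = (df/dxi)^2 / sigma_n^2 + 1,
    # with forward model f(xi) = exp(xi).
    return np.exp(2 * xi) / sigma_n**2 + 1.0

# In 1-D the isometry satisfies (dg/dxi)^2 = M(xi), so g is the cumulative
# integral of sqrt(M) over a grid.
xi_grid = np.linspace(-5.0, 5.0, 2001)
g = cumulative_trapezoid(np.sqrt(metric(xi_grid)), xi_grid, initial=0.0)
g -= np.interp(0.0, xi_grid, g)      # fix the origin of the new coordinates
g_inv = interp1d(g, xi_grid)         # numerical inverse (g is monotone)

# Draw eta ~ N(0, 1) in the transformed coordinates and map back via g^{-1}
# (clipping keeps rare tail draws inside the tabulated range).
rng = np.random.default_rng(0)
eta = rng.standard_normal(10_000)
xi_samples = g_inv(np.clip(eta, g.min(), g.max()))
print(xi_samples.mean(), xi_samples.std())
```

The transformation flattens the direction in which the likelihood is most informative, which is what makes a simple normal approximation accurate in the new coordinates.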

Highlights

  • Accessing the information contained in non-linear and high-dimensional probability distributions remains a core challenge in modern statistics

  • Markov chain Monte Carlo (MCMC) methods have been improved by incorporating geometric information about the posterior, especially by means of Riemannian manifold Hamiltonian Monte Carlo (RMHMC) [11], a particular hybrid Monte Carlo (HMC) [12,13] technique that constructs a Hamiltonian system on a Riemannian manifold with a metric tensor related to the Fisher information metric of the likelihood distribution and the curvature of the prior

  • Overall we find that the geometric Variational Inference (geoVI) solution is slightly closer to the ground truth than the Metric Gaussian Variational Inference (MGVI) solution, and that the posterior uncertainty is smaller for geoVI in most regions, with the exception of the unobserved region, where it is larger than for MGVI

Summary

Geometric Properties of Posterior Distributions

To access the information contained in the posterior distribution P(ξ|d), we wish in this work to exploit the geometric properties of the posterior, in particular with the help of Riemannian geometry. A positive definite measure for the curvature can be obtained by replacing the Hessian of the likelihood with its Fisher information metric [18], defined as the expectation, taken over the likelihood P(d|ξ), of the outer product of the score ∂_ξ log P(d|ξ) with itself. In some sense, we may regard M as a measure for the curvature in the case where the observed data d is unknown and the only information given is the structure of the model itself, as encoded in P(d|ξ). This connection is only of a qualitative nature, but it highlights a key limitation of M when used as the defining property of the posterior geometry. It is noteworthy that attempts have been made to resolve this issue via a more direct approach that recovers a positive definite matrix from the Hessian of the posterior while retaining the local information of the data. Here we rely on the metric M as a measure for the curvature of the posterior and leave possible extensions to future research
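For a Gaussian likelihood d ~ N(f(ξ), N), the Fisher information metric of the likelihood takes the well-known closed form Jᵀ N⁻¹ J, where J = ∂f/∂ξ is the Jacobian of the forward model. The sketch below illustrates why this replacement yields a valid Riemannian metric; the forward model, its dimensions, and the noise covariance are made-up examples, not taken from the paper.

```python
# Sketch (assumed toy setup): Fisher metric of a Gaussian likelihood,
# M = J^T N^{-1} J, which is symmetric positive (semi-)definite by
# construction, unlike the Hessian of the negative log-likelihood itself.

import numpy as np

def fisher_metric(jac, noise_cov):
    """Return M = J^T N^{-1} J for Jacobian J and noise covariance N."""
    return jac.T @ np.linalg.solve(noise_cov, jac)

# Hypothetical forward model f(xi) = exp(A @ xi) with analytic Jacobian:
# d/dxi_j exp((A xi)_i) = exp((A xi)_i) * A_ij.
rng = np.random.default_rng(1)
A = rng.standard_normal((4, 3))   # maps 3 parameters to 4 data points
N = 0.1 * np.eye(4)               # noise covariance

def jacobian(xi):
    return np.exp(A @ xi)[:, None] * A

xi = rng.standard_normal(3)
M = fisher_metric(jacobian(xi), N)

# Symmetric with non-negative eigenvalues, hence usable as a metric tensor.
print(np.linalg.eigvalsh(M))
```

Because the outer-product form cannot produce negative eigenvalues, this substitution guarantees a positive (semi-)definite curvature measure even where the posterior Hessian is indefinite.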

Coordinate Transformation
Basic Properties
Posterior Approximation
Direct Approximation
Numerical Approximation to Sampling
Properties
Numerical Sampling within geoVI
MGVI as a First Order Approximation
Examples
Applications
Gaussian Processes with Unknown Power Spectra
Log-normal Process with Noise Estimation
Separation of Diffuse Emission from Point Sources
Further Properties and Challenges
RMHMC with Metric Approximation
Pathological Cases
Summary and Outlook