Abstract

Based on geometric invariance properties, we derive an explicit prior distribution for the parameters of multivariate linear regression problems in the absence of further prior information. The problem is formulated as a rotationally-invariant distribution of \(L\)-dimensional hyperplanes in \(N\) dimensions, and the associated system of partial differential equations is solved. The derived prior distribution generalizes the already-known special cases, e.g., a 2D plane in three dimensions.
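For orientation, the lowest-dimensional special cases have closed forms that we quote here from the straight-line-fitting and hyperplane-prior literature (our own point of reference, not the paper's general result, which subsumes both): for a line \(y = a x + b\) in two dimensions and a plane \(y = a_1 x_1 + a_2 x_2 + b\) in three dimensions, the translation- and rotation-invariant priors read

\[
p(a, b \mid I) \;\propto\; \bigl(1 + a^2\bigr)^{-3/2},
\qquad
p(a_1, a_2, b \mid I) \;\propto\; \bigl(1 + a_1^2 + a_2^2\bigr)^{-2}.
\]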

Highlights

  • In the context of Bayesian probability theory, a proper assignment of prior probabilities is crucial. Depending on the domain, quite different prior information can be available

  • It may be in the form of point estimates provided by domain experts or in the form of invariances of the system of interest, which should be reflected in the prior probability density [2]

  • On the left-hand side of Figure 1, 15 random lines sampled from a prior distribution that is uniform in the slope \(a \in [0, 50]\) are displayed. Confronted with this result, the typical response is that instead, a more “uniform” prior distribution of the slopes was intended, which is often depicted as in Figure 1 on the right-hand side. That plot was generated from a prior distribution that has an equal probability density for the angle of the line to the abscissa, corresponding to \(p(a \mid I) \propto (1 + a^2)^{-1}\); see the sketch after this list
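To make the contrast between the two panels of Figure 1 reproducible, here is a minimal Python sketch (our own illustration, not the paper's figure code; it assumes NumPy and Matplotlib and draws 15 lines through the origin per panel):

import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
n_lines = 15
x = np.linspace(0.0, 1.0, 2)  # two points suffice to draw a straight line

fig, (ax_left, ax_right) = plt.subplots(1, 2, figsize=(8, 4))

# Left panel: slopes drawn uniformly from [0, 50]; the plot is visually
# dominated by steep lines although the density in a is "uniform".
for a in rng.uniform(0.0, 50.0, n_lines):
    ax_left.plot(x, a * x)
ax_left.set_title("uniform in slope a")

# Right panel: angles drawn uniformly from [0, arctan(50)], so the slope
# a = tan(theta) follows p(a | I) ∝ (1 + a^2)^(-1) on the same range.
for theta in rng.uniform(0.0, np.arctan(50.0), n_lines):
    ax_right.plot(x, np.tan(theta) * x)
ax_right.set_title("uniform in angle theta")

plt.show()

With equal angle density, shallow and steep lines fill the quadrant evenly, which matches the described right-hand panel.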


Introduction

In the context of Bayesian probability theory, a proper assignment of prior probabilities is crucial. In practice, the units of the axes are commonly chosen in such a way that extreme values of the slopes are not a priori overrepresented. If we generalize this requirement to more than one independent or dependent variable, the desired prior probability should be invariant under arbitrary rotations in this parameter space. The known special cases, e.g., the straight line in two dimensions, have since been generalized to invariant priors for \((N-1)\)-dimensional hyperplanes in \(N\)-dimensional space; see, e.g., [4]. These hyperplane priors proved to be valuable for Bayesian neural networks [5], where the specific properties of the prior density favored node-pruning instead of the simple edge-pruning of standard (quadratic) weight regularizers. The assumption of rotation invariance may not be suitable for covariates with different underlying units (e.g., m², kg).
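To see the invariance requirement at work in the simplest setting (a single slope; this worked check is our illustration, not the paper's general PDE machinery): write the slope as \(a = \tan\theta\). Under a rotation of the coordinate axes by an angle \(\varphi\), with \(t = \tan\varphi\), the slope transforms as

\[
a' = \frac{a - t}{1 + a t},
\qquad
\frac{da'}{da} = \frac{1 + t^2}{(1 + a t)^2},
\qquad
1 + a'^2 = \frac{(1 + a^2)(1 + t^2)}{(1 + a t)^2},
\]

and invariance demands \(p(a' \mid I)\,\lvert da'/da \rvert = p(a \mid I)\). Substituting \(p(a \mid I) \propto (1 + a^2)^{-1}\) gives

\[
\frac{(1 + a t)^2}{(1 + a^2)(1 + t^2)} \cdot \frac{1 + t^2}{(1 + a t)^2}
= \frac{1}{1 + a^2},
\]

so the uniform-angle prior is indeed rotation invariant. The paper extends this one-parameter check to arbitrary rotations of \(L\)-dimensional hyperplanes in \(N\) dimensions via a system of partial differential equations.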

Problem Statement
Invariance under Translations
Invariance under Rotations
Rotation in the \(x_i x_j\)-Plane
The PDE System
Solution
Preliminaries
Matrices with Either Row j or Column i
Matrices with Neither Row j nor Column i
Relation to Previously-Derived Special Cases
Practical Hints
Conclusions