Abstract

The goal of statistics is to draw sensible conclusions from data. In mathematical statistics, the observed data are assumed to be generated according to some unknown probability distribution, and the aim is to recover this distribution from the available observations. In parametric statistics this is typically done by considering a finite-dimensional parametric family of probability distributions and estimating a parameter from the data. In non-parametric statistics, by contrast, one deals with infinite-dimensional statistical models; the model is then described by some non-parametric parameter such as a probability distribution or a regression function. In Bayesian statistics one makes inference by choosing a probability distribution on the statistical model. We distinguish between the prior distribution and the posterior distribution, which represent the statistician's belief about the parameter before and after the data have become available. In the frequentist setup, however, the parameter is assumed to have some true value. An asymptotic analysis is then possible by considering the posterior measure of shrinking neighborhoods around the true parameter as the number of observations increases. We are interested in how fast the posterior concentrates around the true parameter.

In this thesis we consider two examples of a conditionally Gaussian process for the construction of a prior distribution on certain statistical models indexed by a function. The two examples that we consider are defined by choosing the paths of the process to be either tensor-product spline functions or location-scale kernel mixtures (both constructions are illustrated in the sketch below). The use of log-spline models and kernel mixtures to construct priors on probability densities is well established in Bayesian non-parametrics, and the use of Gaussian priors provides a unified approach to obtaining rates of posterior contraction in various statistical settings. We consider density estimation, classification and fixed-design regression. If the true function is a function of d variables with smoothness level β in the Hölder sense, then the optimal rate of posterior contraction is of the order n^{-β/(d+2β)}, where n is the number of observations. We show that it is possible to construct Gaussian priors from either the spline functions or the kernel mixtures that achieve posterior contraction at a near-optimal rate. These priors, however, depend on β, an unknown characteristic of the function to be estimated. We show that in both cases it is possible to define a new procedure, based on these Gaussian priors, which also achieves a near-optimal rate of posterior contraction but which itself does not depend on the smoothness level of the function of interest. This procedure thus adapts to the smoothness level.

In the last chapter of this thesis we focus on posterior contraction in the setting of fixed-design regression with Gaussian errors. In this setting, the variance of the errors is a finite-dimensional nuisance parameter which we can equip with a prior as well. The posterior contraction results imply in particular the concentration of posterior mass around this finite-dimensional parameter at a non-parametric rate. We know, however, that posterior contraction in the finite-dimensional case is typically faster: the optimal rate is n^{-1/2}. We show via a semi-parametric Bernstein–von Mises result that it is possible to achieve posterior contraction around the finite-dimensional parameter at rate n^{-1/2} when the infinite-dimensional parameter, the regression function f, is equipped with a Gaussian prior distribution as before.
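For concreteness, the notion of posterior contraction used above can be formalised as follows; this is the standard definition from the Bayesian non-parametrics literature, not a quotation from the thesis. The posterior contracts at rate ε_n around the true parameter f_0 if, for a sufficiently large constant M,

    \Pi_n\left( f : \rho(f, f_0) > M \varepsilon_n \mid X_1, \ldots, X_n \right) \to 0 \quad \text{in } P_{f_0}\text{-probability},

where ρ is a suitable metric on the model. In the Hölder setting described above, the target is ε_n = n^{-β/(d+2β)}, possibly up to a logarithmic factor ("near-optimal").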
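As a rough illustration of the two prior constructions, the Python sketch below draws one sample path of each type of process in one dimension: a random spline with i.i.d. Gaussian B-spline coefficients (the thesis works with tensor products of such bases for functions of d variables) and a location-scale Gaussian kernel mixture with i.i.d. Gaussian weights. All tuning choices here (number of knots, bandwidth, evaluation grid) are arbitrary illustrative values, not the constructions from the thesis.

    import numpy as np
    from scipy.interpolate import BSpline

    rng = np.random.default_rng(1)
    x = np.linspace(0.0, 1.0, 200)  # evaluation grid on [0, 1]

    # Spline paths: f = sum_j Z_j B_j with Z_j i.i.d. N(0, 1); f is a
    # mean-zero Gaussian process because it is linear in the coefficients.
    degree = 3
    interior = np.linspace(0.0, 1.0, 10)
    knots = np.r_[np.zeros(degree), interior, np.ones(degree)]  # clamped knot vector
    n_coef = len(knots) - degree - 1
    spline_path = BSpline(knots, rng.standard_normal(n_coef), degree)(x)

    # Location-scale kernel mixture paths: f(x) = sum_j Z_j phi((x - m_j) / sigma)
    # with Gaussian kernel phi, fixed bandwidth sigma and i.i.d. N(0, 1) weights.
    sigma = 0.1
    locations = np.linspace(0.0, 1.0, 25)
    weights = rng.standard_normal(locations.size)
    diffs = (x[:, None] - locations[None, :]) / sigma
    mixture_path = np.exp(-0.5 * diffs ** 2) @ weights

Both draws are realisations of mean-zero Gaussian processes given the tuning quantities (number of knots, bandwidth). Loosely speaking, putting priors on such quantities is what makes the resulting processes conditionally Gaussian and is the structure behind the adaptive procedures described in the abstract.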
