Abstract
We address the problem of Bayesian variable selection for high-dimensional linear regression. We consider a generative model that uses a spike-and-slab-like prior distribution obtained by multiplying a deterministic binary vector, which encodes the sparsity of the problem, with a random Gaussian parameter vector. The originality of the work is to perform inference by relaxing the model and maximizing a type-II log-likelihood with an EM algorithm. Model selection is performed afterwards, relying on Occam's razor and on the path of models produced by the EM algorithm. Numerical comparisons between our method, called spinyReg, and state-of-the-art high-dimensional variable selection algorithms (such as the lasso, the adaptive lasso, stability selection, and spike-and-slab procedures) are reported. Competitive variable selection results and predictive performances are achieved on both simulated and real benchmark data sets. An original regression data set involving the prediction of the number of visitors of the Orsay museum in Paris using bike-sharing system data is also introduced, illustrating the efficiency of the proposed approach. The R package spinyReg implementing the method proposed in this paper is available on CRAN.
Highlights
Over the past decades, parsimony has emerged as a very natural way to deal with high-dimensional data spaces (Candes, 2014).
We consider the problem of Bayesian variable selection for high-dimensional linear regression through a sparse generative model.
Summary
Parsimony has emerged as a very natural way to deal with high-dimensional data spaces (Candes, 2014). In the context of linear regression, finding a parsimonious parameter vector can prevent overfitting, make an ill-posed problem (such as a "large p, small n" situation) tractable, and allow interpretation of the data by identifying which predictors are relevant. The problem of finding such predictors is referred to as sparse regression or variable selection, and has mainly been addressed either by penalizing the likelihood of the data or by using Bayesian models.
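The generative model sketched in the abstract can be illustrated with a small simulation. This is a hedged, minimal sketch in Python rather than the paper's R implementation: the variable names (`z`, `w`, `beta`), the dimensions, and the noise level are illustrative assumptions, not values from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

n, p, k = 50, 200, 5          # n observations, p predictors, k relevant ones (illustrative)
X = rng.normal(size=(n, p))   # design matrix

# Deterministic binary vector z encoding which predictors are relevant.
z = np.zeros(p)
z[:k] = 1.0

# Random Gaussian parameter vector w (the "slab" part of the prior).
w = rng.normal(scale=2.0, size=p)

# The elementwise product z * w yields a sparse regression vector,
# giving the spike-and-slab-like prior on the coefficients.
beta = z * w

# Observations generated with additive Gaussian noise.
y = X @ beta + rng.normal(scale=0.5, size=n)

print(np.count_nonzero(beta))  # at most k nonzero coefficients
```

The point of the construction is that sparsity lives entirely in the deterministic vector `z`, while the Gaussian vector `w` carries the coefficient magnitudes; inference on `z` (relaxed to continuous values in the paper) is what the EM-based type-II likelihood maximization targets.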