The spike-and-slab lasso and scalable algorithm to accommodate multinomial outcomes in variable selection problems

Inmaculada Aban, The Alzheimer's Disease Neuroimaging Initiative, Nengjun Yi, Justin M Leach

doi:10.1080/02664763.2023.2258301

Abstract

Spike-and-slab prior distributions are used to impose variable selection in Bayesian regression-style problems with many possible predictors. These priors are a mixture of two zero-centered distributions with differing variances, so that parameter estimates are shrunk by different amounts depending on whether the corresponding predictors are relevant to the outcome. The spike-and-slab lasso assigns mixtures of double exponential distributions as priors for the parameters. This framework was initially developed for linear models, later extended to generalized linear models, and shown to perform well in scenarios requiring sparse solutions. Standard formulations of generalized linear models cannot immediately accommodate categorical outcomes with more than two categories, i.e. multinomial outcomes, and require modifications to model specification and parameter estimation. Such modifications are relatively straightforward in a classical setting but require additional theoretical and computational considerations in Bayesian settings, which can depend on the choice of prior distributions for the parameters of interest. While previous developments of the spike-and-slab lasso focused on continuous, count, and/or binary outcomes, we generalize the spike-and-slab lasso to accommodate multinomial outcomes, developing both the theoretical basis for the model and an expectation-maximization algorithm to fit the model. To our knowledge, this is the first generalization of the spike-and-slab lasso to allow for multinomial outcomes.
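As a rough illustration of the prior described in the abstract, the following Python sketch evaluates a two-component mixture of zero-centered double exponential densities and the conditional probability that a coefficient belongs to the slab component. The function names, the mixing weight `theta`, and the scale values `s0` (spike) and `s1` (slab) are assumptions chosen for illustration, not the notation or implementation of the paper. The `softmax` helper assumes the usual multinomial logistic link between class-specific linear predictors and class probabilities; the abstract does not state which parameterization the authors use.

```python
import math


def laplace_pdf(beta, scale):
    """Density of a zero-centered double exponential (Laplace) distribution."""
    return math.exp(-abs(beta) / scale) / (2.0 * scale)


def ssl_prior_pdf(beta, theta, s0, s1):
    """Mixture prior: spike (small scale s0) with weight 1 - theta,
    slab (larger scale s1) with weight theta. Names are hypothetical."""
    spike = (1.0 - theta) * laplace_pdf(beta, s0)
    slab = theta * laplace_pdf(beta, s1)
    return spike + slab


def slab_probability(beta, theta, s0, s1):
    """Conditional probability that beta came from the slab component,
    given its current value (Bayes' rule on the two mixture components)."""
    spike = (1.0 - theta) * laplace_pdf(beta, s0)
    slab = theta * laplace_pdf(beta, s1)
    return slab / (spike + slab)


def softmax(etas):
    """Map class-specific linear predictors to class probabilities
    (assumed multinomial logistic link)."""
    shift = max(etas)  # subtract the maximum for numerical stability
    exps = [math.exp(eta - shift) for eta in etas]
    total = sum(exps)
    return [value / total for value in exps]


# Illustrative values only: a coefficient at zero versus a large coefficient.
p_small = slab_probability(0.0, theta=0.5, s0=0.05, s1=1.0)
p_large = slab_probability(2.0, theta=0.5, s0=0.05, s1=1.0)
class_probs = softmax([0.2, -1.0, 3.0])
```

With these illustrative settings, a coefficient at zero is assigned a small slab probability and is therefore shrunk strongly, while a coefficient far from zero is assigned a slab probability close to 1 and is shrunk only lightly. In an expectation-maximization scheme of the kind the abstract mentions, a quantity like `slab_probability` would plausibly play the role of the E-step weight, though the details of the authors' algorithm are not given here.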

Full Text