Abstract

High-dimensional variable selection is an important research topic in modern statistics. While methods using nonlocal priors have been thoroughly studied for variable selection in linear regression, the crucial high-dimensional model selection properties of nonlocal priors in generalized linear models have not been investigated. In this paper, we consider a hierarchical generalized linear regression model with the product moment nonlocal prior over coefficients and examine its properties. Under standard regularity assumptions, we establish strong model selection consistency in a high-dimensional setting, where the number of covariates is allowed to increase at a sub-exponential rate with the sample size. The Laplace approximation is implemented for computing the posterior probabilities, and the shotgun stochastic search procedure is suggested for exploring the posterior space. The proposed method is validated through simulation studies and illustrated by a real data example on functional activity analysis in an fMRI study for predicting Parkinson’s disease.
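For reference, a common form of the product moment (pMOM) nonlocal prior is the following density, shown here under the simplifying assumptions of order r = 1 and an identity scale matrix (the paper's exact hyperparameterization may differ):

\[
\pi(\beta \mid \tau, \sigma^{2})
= \prod_{i=1}^{p} \frac{\beta_i^{2}}{\tau\sigma^{2}}\, N(\beta_i \mid 0, \tau\sigma^{2})
= (2\pi)^{-p/2} (\tau\sigma^{2})^{-3p/2} \exp\!\left(-\frac{\beta^{\top}\beta}{2\tau\sigma^{2}}\right) \prod_{i=1}^{p} \beta_i^{2}.
\]

The factor \(\prod_i \beta_i^{2}\) forces the density to vanish at \(\beta_i = 0\), the defining nonlocal property; a local prior instead remains positive at the null value. In generalized linear models such as logistic regression, \(\sigma^{2}\) is typically fixed at 1.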

Highlights

  • With the increasing ability to collect and store data at large scales, we face both opportunities and challenges in analyzing data with a large number of covariates per observation, the so-called high-dimensional problem

  • We focus on logistic regression, but our argument extends to other generalized linear models (GLMs) as well

  • The dataset is randomly divided into a training set (80%) and a testing set (20%) while maintaining the Parkinson’s disease (PD) to healthy control (HC) ratio in both sets; a sketch of such a stratified split follows this list
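A minimal sketch of the stratified 80/20 split described above, written in Python with scikit-learn. The data here are synthetic placeholders; X, y, and the random seed are illustrative assumptions, not the study's actual data or code:

    from sklearn.model_selection import train_test_split
    import numpy as np

    # Illustrative data: 200 subjects, 1000 imaging features,
    # binary labels with 1 = Parkinson's disease (PD), 0 = healthy control (HC).
    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 1000))
    y = rng.integers(0, 2, size=200)

    # stratify=y keeps the PD:HC ratio the same in the training and testing sets.
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.20, stratify=y, random_state=0
    )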


Introduction

With the increasing ability to collect and store data at large scales, we face both opportunities and challenges in analyzing data with a large number of covariates per observation, the so-called high-dimensional problem. When this situation arises, variable selection is one of the most commonly used techniques, especially in radiological and genetic research, due to the nature of high-dimensional data extracted from imaging scans and gene sequencing. Penalization approaches for sparse model selection, including the smoothly clipped absolute deviation (SCAD) [2], the minimum concave penalty (MCP) [3], and many variations, have been introduced. Most of these methods were first considered in the context of linear regression and later extended to generalized linear models. Bayesian approaches in this line of work have employed local priors, which retain positive density at null parameter values, to achieve desirable shrinkage.
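To make the two penalties concrete, here is a short Python sketch of the SCAD and MCP penalty functions in their standard textbook forms (Fan and Li's SCAD with the conventional a = 3.7; Zhang's MCP with concavity parameter gamma). This is an illustration of the penalties themselves, not the authors' implementation:

    import numpy as np

    def scad_penalty(t, lam, a=3.7):
        """SCAD penalty: linear near zero, concave quadratic transition,
        constant for |t| > a*lam (so large effects are not shrunk)."""
        t = np.abs(t)
        return np.where(
            t <= lam,
            lam * t,
            np.where(
                t <= a * lam,
                (2 * a * lam * t - t**2 - lam**2) / (2 * (a - 1)),
                (a + 1) * lam**2 / 2,
            ),
        )

    def mcp_penalty(t, lam, gamma=3.0):
        """MCP: quadratically attenuated near zero, constant for |t| > gamma*lam."""
        t = np.abs(t)
        return np.where(
            t <= gamma * lam,
            lam * t - t**2 / (2 * gamma),
            gamma * lam**2 / 2,
        )

Both penalties agree with the lasso near zero but flatten out for large coefficients, which reduces the bias that a constant-rate penalty imposes on strong signals.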

