Selection In Linear Regression Research Articles

In molecular biology, advances in high-throughput technologies have made it possible to study complex multivariate phenotypes and their simultaneous associations with high-dimensional genomic and other omics data, a problem that can be studied with high-dimensional multi-response regression, where the response variables are potentially highly correlated. To this end, we recently introduced several multivariate Bayesian variable and covariance selection models, e.g., Bayesian estimation methods for sparse seemingly unrelated regression for variable and covariance selection. Several variable selection priors have been implemented in this context, in particular the hotspot detection prior for latent variable inclusion indicators, which results in sparse variable selection for associations between predictors and multiple phenotypes. We also propose an alternative that uses a Markov random field (MRF) prior to incorporate prior knowledge about the dependence structure of the inclusion indicators. Inference of Bayesian seemingly unrelated regression (SUR) by Markov chain Monte Carlo methods is made computationally feasible by factorisation of the covariance matrix amongst the response variables. In this paper we present BayesSUR, an R package that allows the user to easily specify and run a range of different Bayesian SUR models, which have been implemented in C++ for computational efficiency. The R package allows the models to be specified in a modular way, where the user chooses the priors for variable selection and for covariance selection separately. We demonstrate the performance of sparse SUR models with the hotspot prior and spike-and-slab MRF prior on synthetic and real data sets representing eQTL or mQTL studies and in vitro anti-cancer drug screening studies as examples of typical applications.

In this paper, we propose a new estimation procedure for discovering the structure of Gaussian Markov random fields (MRFs) with false discovery rate (FDR) control, making use of sorted ℓ1-norm (SL1) regularization. A Gaussian MRF is an undirected graph representing a multivariate Gaussian distribution, where nodes are random variables and edges represent conditional dependence between the connected nodes. Since it is possible to learn the edge structure of Gaussian MRFs directly from data, Gaussian MRFs provide an excellent way to understand complex data by revealing the dependence structure among many input features, such as genes, sensors, users, or documents. In learning the graphical structure of Gaussian MRFs, the goal is to discover the actual edges of the underlying but unknown probabilistic graphical model; this becomes harder as the number of random variables (features) p grows relative to the number of data points n. In particular, when p ≫ n, it is statistically unavoidable for any estimation procedure to include false edges. There have therefore been many attempts to reduce the false detection of edges, in particular by applying different types of regularization to the learned parameters. Our method makes use of SL1 regularization, introduced recently for model selection in linear regression. We focus on one benefit of SL1 regularization: it can be used to control the FDR of detecting important random variables. Adapting SL1 to probabilistic graphical models, we show that SL1 can be used for the structure learning of Gaussian MRFs using our suggested procedure nsSLOPE (neighborhood selection Sorted L-One Penalized Estimation), controlling the FDR of detecting edges.
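
As a rough illustration of the sorted ℓ1-norm mentioned in the abstract above, the sketch below computes the penalty as a weighted sum of coefficient magnitudes, with the largest weight paired with the largest magnitude. The function names `bh_lambdas` and `sorted_l1_norm` are hypothetical. The weight sequence is a Benjamini-Hochberg-style choice commonly seen in the SLOPE literature and is an assumption here, not necessarily the sequence used by nsSLOPE.

```python
from statistics import NormalDist


def bh_lambdas(p, q):
    """Decreasing weights lambda_i = Phi^{-1}(1 - i*q / (2p)), i = 1..p.

    Assumed, illustrative choice of weights; q is the target FDR level (0 < q < 1).
    """
    nd = NormalDist()  # standard normal distribution
    return [nd.inv_cdf(1.0 - i * q / (2.0 * p)) for i in range(1, p + 1)]


def sorted_l1_norm(beta, lambdas):
    """Sorted L1 norm: sum_i lambda_i * |beta|_(i), magnitudes sorted in decreasing order."""
    if len(beta) != len(lambdas):
        raise ValueError("beta and lambdas must have the same length")
    magnitudes = sorted((abs(b) for b in beta), reverse=True)
    return sum(lam * mag for lam, mag in zip(lambdas, magnitudes))


if __name__ == "__main__":
    beta = [0.5, -2.0, 0.0, 1.0]
    lambdas = bh_lambdas(len(beta), q=0.1)
    print(sorted_l1_norm(beta, lambdas))
```

With all weights equal, the penalty reduces to the ordinary ℓ1 norm scaled by that weight, which is one way to see SL1 as a generalization of the lasso penalty.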

Articles published on Selection In Linear Regression

COMBSS: best subset selection via continuous optimization

Carousel Greedy Algorithms for Feature Selection in Linear Regression

Variable and model selection method in linear regression for analysis

An Approximated Collapsed Variational Bayes Approach to Variable Selection in Linear Regression

Bayesian variable selection for linear regression with the κ-G priors

Robust model selection using the out-of-bag bootstrap in linear regression

Bayes factor asymptotics for variable selection in the Gaussian process framework

BayesSUR: An R Package for High-Dimensional Multivariate Bayesian Variable and Covariance Selection in Linear Regression

Variable selection for linear regression in large databases: exact methods

Variable Selection Using Nonlocal Priors in High-Dimensional Generalized Linear Models With Application to fMRI Data Analysis

A simple new approach to variable selection in regression, with application to genetic fine mapping

Spike-and-Slab Group Lassos for Grouped Regression and Sparse Generalized Additive Models

Structure Learning of Gaussian Markov Random Fields with False Discovery Rate Control

An efficient optimization approach for best subset selection in linear regression, with application to model selection and fitting in autoregressive time-series

Lasso-based index tracking and statistical arbitrage long-short strategies

Mixed integer nonlinear goal programming approach to variable selection in linear regression

A Loss-Based Prior for Variable Selection in Linear Regression Methods

A two-step approach for variable selection in linear regression with measurement error

An MCMC approach to empirical Bayes inference and Bayesian sensitivity analysis via empirical processes

Exhaustive Search for Sparse Variable Selection in Linear Regression
