Abstract

Background: Genome-wide association studies involve detecting association between millions of genetic variants and a trait, typically using univariate regression to test the association between each single variant and the phenotype. Alternatively, Lasso penalized regression allows one to jointly model the relationship between all genetic variants and the phenotype. However, it is unclear how to best conduct inference on the individual Lasso coefficients, especially in high-dimensional settings.

Methods: We consider six methods for testing the Lasso coefficients: two permutation approaches (Lasso-Ayers, Lasso-PL) and one analytic approach (Lasso-AL) for selecting the penalty parameter to control the type-1 error, a residual bootstrap (Lasso-RB), a modified residual bootstrap (Lasso-MRB), and a permutation test (Lasso-PT). Methods are compared via simulations and application to the Minnesota Center for Twins and Family Study.

Results: We show that for finite sample sizes with an increasing number of null predictors, Lasso-RB, Lasso-MRB, and Lasso-PT fail to be viable methods of inference. However, Lasso-PL and Lasso-AL remain fast and powerful tools for conducting inference with the Lasso, even in high dimensions.

Conclusion: Our results suggest that the proposed permutation selection procedure (Lasso-PL) and the analytic selection method (Lasso-AL) are fast and powerful alternatives to the standard univariate analysis in genome-wide association studies.

Highlights

  • Genome-wide association studies involve detecting association between millions of genetic variants and a trait, typically using univariate regression to test the association between each single variant and the phenotype

  • A genome-wide association study (GWAS) can be viewed as a high-dimensional variable selection problem with the goal of finding single nucleotide polymorphisms (SNPs) that are significantly associated with a phenotype of interest

  • We propose a modified version of the permutation method of Ayers and Cordell and compare it with the original method, as well as with the analytic method of Yi et al. Other recently proposed methods of inference for penalized regression not considered in this paper include the following: Zhang [10] and Javanmard [11] use a “debiased” Lasso, which attempts to remove the bias in the Lasso coefficients and constructs normal-based confidence intervals from the transformed coefficients

Summary

Introduction

Genome-wide association studies (GWASs) study the association between millions of genetic variants, called single nucleotide polymorphisms (SNPs), and traits of interest, typically using univariate regression to test the association between each single variant and the phenotype. Lasso penalized regression instead allows one to jointly model the relationship between all genetic variants and the phenotype. Jointly modeling all SNPs may lead to more accurate inference due to decreased residual variance in the phenotype of interest. However, it is unclear how to best conduct inference on the individual Lasso coefficients, especially in high-dimensional settings, and developing valid methods for conducting inference on the penalized regression coefficients remains an open area of research.
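To make the idea of selecting the Lasso penalty by permutation more concrete, the following is a minimal sketch in plain Python. It is not taken from the paper and is not necessarily the procedure the authors call Lasso-Ayers or Lasso-PL. It assumes the objective (1/2n)||y - Xb||^2 + lam * ||b||_1 with a centered phenotype and standardized predictors, for which the smallest penalty that sets every coefficient to zero is max_j |x_j'y| / n. Permuting the phenotype breaks any association, so an upper quantile of that value across permutations gives a penalty under which, when no predictor is associated, a predictor enters the model only about alpha of the time. The function names, the simulated data, and the choices of n_perm and alpha are all hypothetical.

```python
import random
import statistics


def standardize(column):
    """Center a column and scale it to unit (population) variance.

    Assumes the column is not constant, otherwise the scale is zero.
    """
    mean = statistics.fmean(column)
    sd = statistics.pstdev(column)
    return [(v - mean) / sd for v in column]


def lambda_max(columns, y):
    """Smallest penalty at which every Lasso coefficient is zero.

    Follows from the KKT conditions of (1/2n)||y - Xb||^2 + lam*||b||_1.
    """
    n = len(y)
    return max(abs(sum(x * v for x, v in zip(col, y))) / n for col in columns)


def permutation_lambda(columns, y, n_perm=200, alpha=0.05, seed=0):
    """Approximate (1 - alpha) quantile of lambda_max over permuted phenotypes."""
    rng = random.Random(seed)
    y_perm = list(y)
    null_values = []
    for _ in range(n_perm):
        rng.shuffle(y_perm)  # permuting y removes any true association
        null_values.append(lambda_max(columns, y_perm))
    null_values.sort()
    index = min(n_perm - 1, int((1 - alpha) * n_perm))
    return null_values[index]


def lasso_cd(columns, y, lam, n_iter=100):
    """Plain coordinate descent for the Lasso with standardized columns."""
    n = len(y)
    beta = [0.0] * len(columns)
    resid = list(y)
    for _ in range(n_iter):
        for j, col in enumerate(columns):
            # Partial correlation of column j with the current residual.
            rho = sum(x * r for x, r in zip(col, resid)) / n + beta[j]
            # Soft-thresholding update.
            new = max(abs(rho) - lam, 0.0) * (1 if rho > 0 else -1)
            if new != beta[j]:
                diff = new - beta[j]
                resid = [r - diff * x for r, x in zip(resid, col)]
                beta[j] = new
    return beta


# Hypothetical simulated data: 20 predictors, only the first one is causal.
rng = random.Random(1)
n, p = 200, 20
columns = [standardize([rng.gauss(0, 1) for _ in range(n)]) for _ in range(p)]
y = [columns[0][i] + rng.gauss(0, 1) for i in range(n)]
mean_y = statistics.fmean(y)
y = [v - mean_y for v in y]

lam = permutation_lambda(columns, y)
beta = lasso_cd(columns, y, lam)
selected = [j for j, b in enumerate(beta) if b != 0.0]
print("penalty:", lam, "selected predictors:", selected)
```

In this sketch, a predictor is declared associated if its coefficient is nonzero at the permutation-selected penalty. A real GWAS would need a far more efficient solver and would have to account for correlation among SNPs, which this toy example ignores.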

