Evaluation of a LASSO regression approach on the unrelated samples of Genetic Analysis Workshop 17

Wei Guo, Xiaofeng Zhu, Robert C Elston

doi:10.1186/1753-6561-5-s9-s12

Abstract

The Genetic Analysis Workshop 17 data we used comprise 697 unrelated individuals genotyped at 24,487 single-nucleotide polymorphisms (SNPs) from a mini-exome scan, using real sequence data for 3,205 genes annotated by the 1000 Genomes Project and simulated phenotypes. We studied 200 sets of simulated phenotypes of trait Q2. An important feature of this data set is that most SNPs are rare, with 87% of the SNPs having a minor allele frequency less than 0.05. For rare SNP detection, in this study we performed a least absolute shrinkage and selection operator (LASSO) regression and F tests at the gene level and calculated the generalized degrees of freedom to avoid any selection bias. For comparison, we also carried out linear regression and the collapsing method, which sums the rare SNPs, modified for a quantitative trait and with two different allele frequency thresholds. The aim of this paper is to evaluate these four approaches in this mini-exome data and compare their performance in terms of power and false positive rates. In most situations the LASSO approach is more powerful than linear regression and collapsing methods. We also note the difficulty in determining the optimal threshold for the collapsing method and the significant role that linkage disequilibrium plays in detecting rare causal SNPs. If a rare causal SNP is in strong linkage disequilibrium with a common marker in the same gene, power will be much improved.
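As a rough illustration of the collapsing idea mentioned above (summing the rare SNPs in a gene and relating that sum to a quantitative trait), here is a minimal Python sketch. The function names, the toy genotypes, the toy trait values, and the 0.1 threshold are all assumptions made for illustration. The thresholds, gene-level F tests, and other details of the paper's analysis are not reproduced here.

```python
def minor_allele_frequency(genotypes):
    """Frequency of the counted allele, folded so the result is at most 0.5."""
    freq = sum(genotypes) / (2 * len(genotypes))
    return min(freq, 1 - freq)


def collapse_rare(snp_columns, maf_threshold):
    """Per individual, sum the genotype codes of SNPs whose MAF is below the threshold."""
    rare = [col for col in snp_columns if minor_allele_frequency(col) < maf_threshold]
    n_individuals = len(snp_columns[0])
    return [sum(col[i] for col in rare) for i in range(n_individuals)]


def regression_slope(x, y):
    """Least-squares slope of y on a single predictor x (x must not be constant)."""
    x_bar = sum(x) / len(x)
    y_bar = sum(y) / len(y)
    sxy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    sxx = sum((a - x_bar) ** 2 for a in x)
    return sxy / sxx


# Hypothetical toy gene: 3 SNPs (rows) for 10 individuals, coded 0/1/2 minor-allele counts.
snps = [
    [0, 0, 1, 0, 0, 0, 0, 0, 0, 0],  # rare
    [0, 0, 0, 0, 0, 1, 0, 0, 0, 0],  # rare
    [1, 2, 1, 0, 1, 1, 0, 2, 1, 0],  # common, excluded from the collapsed score
]
trait = [1.0, 1.2, 3.0, 0.9, 1.1, 3.2, 1.0, 0.8, 1.2, 1.0]  # made-up quantitative trait

burden = collapse_rare(snps, maf_threshold=0.1)
slope = regression_slope(burden, trait)
```

Changing `maf_threshold` changes which SNPs enter the collapsed score, which is one way to see why the choice of threshold matters for this method.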

Highlights

With the rapid development of technologies, more and more single-nucleotide polymorphisms (SNPs) have become available and, in particular, most of the rare variants can be identified using the next-generation sequencing technique
Current approaches for testing rare variants include grouping the rare variants based on a threshold of the minor allele frequency (MAF) [1], summing the rare variants weighted by the allele frequencies in control subjects [2,3], and clustering rare haplotypes using family data [4]
To deal with the singular matrix in linear regression caused by the rare variants, we adopt least absolute shrinkage and selection operator (LASSO) regression, a statistical method that effectively shrinks the coefficients of unassociated SNPs and reduces the variance of the estimated regression coefficients
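To make the shrinkage behaviour concrete, the following is a minimal Python sketch of LASSO fitted by cyclic coordinate descent with soft-thresholding. This is a standard textbook algorithm and not necessarily the fitting procedure used by the authors. The function names, the penalty value `lam`, and the toy data are assumptions, and the generalized degrees of freedom and F tests described in the abstract are not covered. The toy design matrix deliberately contains two identical columns, so ordinary least squares would face a singular matrix.

```python
def soft_threshold(z, lam):
    """Shrink z toward zero by lam; values within [-lam, lam] become exactly zero."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def lasso_coordinate_descent(X, y, lam, n_iter=200):
    """Minimise (1/(2n)) * ||y - X b||^2 + lam * ||b||_1 by cyclic coordinate descent.

    Assumes y and the columns of X are already centred (no intercept is fitted).
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    residual = list(y)  # equals y - X @ beta, with beta starting at zero
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue  # a column with no variation carries no information
            # Correlation of column j with the residual that excludes column j's own fit.
            rho = sum(X[i][j] * (residual[i] + X[i][j] * beta[j]) for i in range(n)) / n
            new_beta = soft_threshold(rho, lam) / col_sq[j]
            delta = new_beta - beta[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= X[i][j] * delta
                beta[j] = new_beta
    return beta


# Hypothetical centred toy data: columns 0 and 1 are identical (singular design),
# column 2 is unrelated to the trait.
X = [
    [-1, -1, 1],
    [1, 1, 1],
    [-1, -1, -1],
    [1, 1, -1],
]
y = [-2.0, 2.0, -2.0, 2.0]  # equals 2 * column 0

beta = lasso_coordinate_descent(X, y, lam=0.5)
```

In this toy case the fit remains well defined despite the duplicated column, the coefficient of the unrelated column is set to exactly zero, and the retained coefficient is shrunk below the value of 2 used to generate `y`. Which of the two identical columns keeps the signal depends only on the update order.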

Summary

Introduction

With the rapid development of technologies, more and more single-nucleotide polymorphisms (SNPs) have become available and, in particular, most of the rare variants can be identified using the next-generation sequencing technique. Detecting associated rare variants that contribute to phenotypic variation is still a huge challenge. Current approaches for testing rare variants include grouping the rare variants based on a threshold of the minor allele frequency (MAF) [1], summing the rare variants weighted by the allele frequencies in control subjects [2,3], and clustering rare haplotypes using family data [4]. Another approach is to use a penalized regression, which can avoid the singular design matrix that may result from rare variants by

Journal: BMC Proceedings	Publication Date: Nov 29, 2011
Citations: 14	License type: CC BY 2.0
