Abstract

Genome-wide association studies (GWAS) test for disease-trait associations and estimate effect sizes at tag single-nucleotide polymorphisms (SNPs), which imperfectly capture variation at causal SNPs. Sequencing studies can examine potential causal SNPs directly; however, sequencing the whole genome or exome can be prohibitively expensive. Costs can be limited by using a GWAS to detect the associated region(s) at tag SNPs followed by targeted sequencing to identify and estimate the effect size of the causal variant. Genetic effect estimates obtained from association studies can be inflated because of a form of selection bias known as the winner’s curse. Conversely, estimates at tag SNPs can be attenuated compared to the causal SNP because of incomplete linkage disequilibrium. These two effects oppose each other. Analysis of rare SNPs further complicates our understanding of the winner’s curse because rare SNPs are difficult to tag and analysis can involve collapsing over multiple rare variants. In two-stage analysis of Genetic Analysis Workshop 17 simulated data sets, we find that selection at the tag SNP produces upward bias in the estimate of effect at the causal SNP, even when the tag and causal SNPs are not well correlated. The bias similarly carries through to effect estimates for rare variant summary measures. Replication studies designed with sample sizes computed using biased estimates will be under-powered to detect a disease-causing variant. Accounting for bias in the original study is critical to avoid discarding disease-associated SNPs at follow up.

Highlights

  • Selection bias in genetic association studies arises when the same sample is used for both gene discovery and effect estimation

  • We examine three two-stage scenarios in which a genetic effect is first detected at a tag single-nucleotide polymorphism (SNP) in a genome-wide association study (GWAS), the gene is then sequenced to find the true causal SNP(s), and the genetic effect is estimated at the true causal SNP(s)

  • We present results for quantitative trait Q2 with causal SNP C6S5380 and tag SNPs from the HapMap data set that fall within the VNN1 gene, as defined by the gene information file provided with the Genetic Analysis Workshop 17 (GAW17) data set


Summary

Introduction

Selection bias in genetic association studies arises when the same sample is used for both gene discovery and effect estimation. Under the low power that is common in a genome-wide association study (GWAS), selection causes upward bias in the magnitude of genetic effect estimates because the effect size is estimated only when the test statistic exceeds the threshold for significance. This phenomenon is known as the winner’s curse, and its effect on linkage analyses and on case-control association was demonstrated by Goring et al. [1] and Garner [2], respectively. Conversely, estimates at tag SNPs can be attenuated relative to the causal SNP because of incomplete linkage disequilibrium. The balance between these two trends determines the degree of bias in the estimates.
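The selection mechanism behind the winner’s curse can be illustrated with a small simulation. The sketch below is not the analysis used in the paper: it assumes a single SNP whose effect estimate is normally distributed around the true effect, and the function name `simulate_winners_curse`, the effect size, the standard error, and the significance threshold are all hypothetical values chosen only for illustration. It averages the estimates from the simulated studies that pass the threshold and compares that average with the true effect.

```python
import random
import statistics


def simulate_winners_curse(beta=0.1, se=0.05, z_threshold=3.0,
                           n_studies=20000, seed=1):
    """Simulate many studies of one SNP and summarize the estimates.

    All parameter values are illustrative assumptions, not values
    taken from the GAW17 data.
    """
    rng = random.Random(seed)
    all_estimates = []
    selected = []  # estimates from studies that reach significance
    for _ in range(n_studies):
        # Each study's estimate is the true effect plus sampling noise.
        estimate = rng.gauss(beta, se)
        all_estimates.append(estimate)
        # The effect is "reported" only if the test statistic passes.
        if abs(estimate / se) > z_threshold:
            selected.append(estimate)
    return {
        "true_effect": beta,
        "mean_all": statistics.fmean(all_estimates),
        "mean_selected": statistics.fmean(selected),
        "power": len(selected) / n_studies,
    }


result = simulate_winners_curse()
print(result)
```

Under these assumed settings power is low, so `mean_all` should sit close to `true_effect` while `mean_selected` should be noticeably larger: conditioning on significance keeps only the studies whose sampling noise happened to push the estimate upward. Raising `beta` or lowering `se` increases power and should shrink the gap.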
