Abstract

To date, gene-based rare variant testing approaches have focused on aggregating information across sets of variants to maximize statistical power in identifying genes showing significant association with diseases. Beyond identifying genes that are associated with diseases, the identification of causal variant(s) in those genes and estimation of their effect is crucial for planning replication studies and characterizing the genetic architecture of the locus. However, we illustrate that straightforward single-marker association statistics can suffer from substantial bias introduced by conditioning on gene-based test significance, due to the phenomenon often referred to as “winner's curse.” We illustrate the ramifications of this bias on variant effect size estimation and variant prioritization/ranking approaches, outline parameters of genetic architecture that affect this bias, and propose a bootstrap resampling method to correct for this bias. We find that our correction method significantly reduces the bias due to winner's curse (average two-fold decrease in bias, p < 2.2 × 10⁻⁶) and, consequently, substantially improves mean squared error and variant prioritization/ranking. The method is particularly helpful in adjustment for winner's curse effects when the initial gene-based test has low power and for relatively more common, non-causal variants. Adjustment for winner's curse is recommended for all post-hoc estimation and ranking of variants after a gene-based test. Further work is necessary to continue seeking ways to reduce bias and improve inference in post-hoc analysis of gene-based tests under a wide variety of genetic architectures.

Highlights

  • Results are shown for two pairs of Step 1 and Step 2 test statistics: Qbw followed by Di, and Qsw followed by D2i.

  • (a) Bias is computed as the average difference between the estimated single-marker post-hoc statistic and its expected value. MSE is computed as the average squared difference between the estimated single-marker post-hoc statistic and its expected value.

  • (b) Computed as the percent of variants for which bias decreased after implementing our bootstrap bias-correction strategy.

  • (c) The median change in bias among variants that show an improvement after adjustment (i.e., bias is smaller after adjustment).

  • (d) The median change in bias among variants that show a decline after adjustment (i.e., bias is larger after adjustment).

  • (e) The ratio of the previous two columns.

  • Winner’s curse is problematic for Step 2 of gene-based testing when the Step 1 power is low. As demonstrated both theoretically and via simulation, there is a direct relationship between minor allele frequency and bias/MSE, which makes the unadjusted post-hoc statistics (Di and D2i) prone to over-estimation for more common, non-causal variants.

Summary

INTRODUCTION

Numerous gene-based rare variant tests of association (hereafter GBTs) have been proposed that aggregate genotype-phenotype association signals across rare variants to strengthen the overall evidence of association within a gene of interest (Li and Leal, 2008; Madsen and Browning, 2009; Morris and Zeggini, 2010; Price et al, 2010; Zawistowski et al, 2010; Ionita-Laza et al, 2011; Pan and Shen, 2011; Wu et al, 2011; Lee et al, 2012; Greco et al, 2016). Recent work (Liu and Leal, 2012) has documented the winner's curse phenomenon in estimates of average genetic effect after GBTs, but does not explore potential bias in single-marker test statistics, which is of particular concern given the low power of analysis strategies involving rare variants. Bootstrap resampling bias correction methods estimate the bias of a “naïve” (winner’s-curse-afflicted) estimator and define a bias-corrected estimator by subtracting the estimated bias from the naïve estimator. Many variations of this general framework have been proposed and applied to the estimation of single-marker effect sizes for common variants (Sun and Bull, 2005; Yu et al, 2007; Sun et al, 2011; Xu et al, 2011; Faye et al, 2013; Zhou and Wright, 2015), and more recently to estimation of the average genetic effect for a GBT (Liu and Leal, 2012). We propose a bootstrap resampling and estimation strategy that adjusts the estimates of individual rare variant effects and leads to improved post-hoc variant prioritization.
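The general idea of estimating winner's-curse bias by bootstrap and subtracting it from the naïve estimate can be illustrated with a minimal Python sketch. This is not the procedure proposed in this paper: it is a toy, one-sample version of a generic out-of-bootstrap scheme, in which the "significance test" is a simple z-like statistic on a sample mean. The function names, the threshold of 1.645, and the number of replicates are all assumptions made for illustration.

```python
import random
import statistics


def z_score(sample):
    """z-like statistic for H0: mean == 0 (hypothetical stand-in for a real test)."""
    se = statistics.stdev(sample) / len(sample) ** 0.5
    return statistics.fmean(sample) / se if se > 0 else 0.0


def bootstrap_corrected_mean(data, threshold=1.645, n_boot=500, seed=0):
    """Return (naive, estimated_bias, corrected) for the sample mean.

    Assumed scheme: each bootstrap resample plays the role of the
    "discovery" data. If it passes the significance threshold, we record
    the gap between its estimate and the estimate from the observations
    left out of that resample. The average gap estimates the selection bias.
    """
    rng = random.Random(seed)
    n = len(data)
    naive = statistics.fmean(data)
    gaps = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        chosen = set(idx)
        in_bag = [data[i] for i in idx]
        out_bag = [data[i] for i in range(n) if i not in chosen]
        if not out_bag:
            continue  # no left-out observations to compare against
        if z_score(in_bag) > threshold:  # condition on "significance"
            gaps.append(statistics.fmean(in_bag) - statistics.fmean(out_bag))
    bias = statistics.fmean(gaps) if gaps else 0.0
    return naive, bias, naive - bias


if __name__ == "__main__":
    # Made-up data whose z statistic sits close to the threshold.
    toy = [0.25 + ((i % 10) - 4.5) / 3 for i in range(40)]
    print(bootstrap_corrected_mean(toy))
```

Conditioning each replicate on passing the threshold is what ties the bias estimate to the selection step. Without that condition, the average gap would be close to zero and the correction would do nothing.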

