Abstract
Background: As several rare genomic variants have been shown to affect common phenotypes, rare variants association analysis has received considerable attention. Several efficient association tests using genotype and phenotype similarity measures have been proposed in the literature. The major advantages of similarity-based tests are their ability to accommodate multiple types of DNA variations within one association test, and to account for the possible interaction within a region. However, little work has been done to compare the performance of similarity-based tests in rare variants association scenarios, especially when applied with different rare variants pooling strategies.
Results: Based on population genetics simulations and analysis of a publicly available sequencing data set, we compared the performance of four similarity-based tests and two rare variants pooling strategies. We showed that the weighting approach outperforms collapsing in the presence of a strong effect from rare variants and in the presence of a moderate effect from common variants, whereas collapsing of rare variants is preferable when common variants have a strong effect. We also demonstrated that the difference in statistical power between the two pooling strategies may be substantial. The results also highlighted the consistently high power of two similarity-based approaches when applied with an appropriate pooling strategy.
Conclusions: Population genetics simulations and sequencing data set analysis showed high power of two similarity-based tests and a substantial difference in power between the two pooling strategies.
Highlights
As several rare genomic variants have been shown to affect common phenotypes, rare variants association analysis has received considerable attention
Methods based on genotype similarity include the following: sequence kernel association test (SKAT) [11]; kernel-based association test (KBAT) [18]; multivariate distance matrix regression test (MDMR) [19]; and aggregate U-test [20].
Population genetic simulations: for each test, 1000 permutations were performed to assess the significance of association.
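The permutation step mentioned above can be sketched as follows. This is a minimal illustration, not the procedure used in the paper: the statistic `similarity_statistic` is a hypothetical stand-in (a sum over pairs of an IBS-style genotype similarity times a centered phenotype product) and is not SKAT, KBAT, MDMR, or the aggregate U-test. The function names, the seed, and the add-one p-value correction are assumptions made for the example. Only the count of 1000 permutations comes from the text.

```python
import random


def similarity_statistic(genotypes, phenotypes):
    """Hypothetical pairwise statistic (an assumption, not a published test).

    genotypes: list of per-individual lists of allele counts (0, 1 or 2).
    phenotypes: list of numeric phenotypes, one per individual.
    """
    n = len(phenotypes)
    mean_y = sum(phenotypes) / n
    stat = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            # IBS-style similarity: 2 per site for identical genotypes,
            # 0 per site for opposite homozygotes.
            g_sim = sum(2 - abs(a - b) for a, b in zip(genotypes[i], genotypes[j]))
            # Phenotype similarity as a product of centered phenotypes.
            y_sim = (phenotypes[i] - mean_y) * (phenotypes[j] - mean_y)
            stat += g_sim * y_sim
    return stat


def permutation_p_value(genotypes, phenotypes, n_perm=1000, seed=0):
    """Empirical one-sided p-value from shuffling phenotypes across individuals."""
    rng = random.Random(seed)
    observed = similarity_statistic(genotypes, phenotypes)
    shuffled = list(phenotypes)
    exceed = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # breaks any genotype-phenotype link
        if similarity_statistic(genotypes, shuffled) >= observed:
            exceed += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (exceed + 1) / (n_perm + 1)


# Toy data: the first 10 individuals are cases and homozygous carriers.
toy_genotypes = [[2]] * 10 + [[0]] * 10
toy_phenotypes = [1] * 10 + [0] * 10
p_value = permutation_p_value(toy_genotypes, toy_phenotypes)
```

Shuffling phenotypes while leaving genotypes fixed preserves the correlation structure among variants, which is why the same scheme can be applied to any of the similarity-based statistics.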
Summary
As several rare genomic variants have been shown to affect common phenotypes, rare variants association analysis has received considerable attention. One major advantage of similarity-based tests is their ability to accommodate multiple types of DNA variation (SNPs, insertions and deletions, CNVs) observed within a region, given the flexibility in the choice of similarity measure between two sequences [13]. Another is that they address the possible interaction of different variants within a region, which is potentially accounted for by considering multi-site similarity measures [14]. It remains unclear which pooling strategy is best applied with similarity-based tests.
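To make the two pooling strategies concrete, the sketch below contrasts collapsing with weighting on a small genotype matrix. It is an illustration under stated assumptions rather than the paper's implementation: genotypes are assumed to be coded as minor allele counts, the 0.05 frequency threshold is arbitrary, the collapsed variable is a simple carrier indicator, and the weights are an inverse-standard-deviation scheme with a smoothed frequency estimate. The source does not specify any of these choices, and all function names are hypothetical.

```python
import math


def allele_frequencies(genotypes):
    """Per-variant allele frequency from minor allele counts (0, 1 or 2)."""
    n = len(genotypes)
    m = len(genotypes[0])
    return [sum(row[k] for row in genotypes) / (2 * n) for k in range(m)]


def collapse_rare(genotypes, maf_threshold=0.05):
    """Collapsing: replace all rare variants by one carrier indicator.

    Common variants are kept as they are; the last column is 1 if the
    individual carries at least one rare allele in the region, else 0.
    The threshold is an assumption made for this example.
    """
    freqs = allele_frequencies(genotypes)
    rare = [k for k, f in enumerate(freqs) if f < maf_threshold]
    common = [k for k, f in enumerate(freqs) if f >= maf_threshold]
    pooled = []
    for row in genotypes:
        carrier = 1 if any(row[k] > 0 for k in rare) else 0
        pooled.append([row[k] for k in common] + [carrier])
    return pooled


def weight_variants(genotypes):
    """Weighting: keep every variant, scaled so rarer variants count more.

    Weight = 1 / sqrt(q * (1 - q)), with q a smoothed frequency so that
    monomorphic variants do not cause a division by zero (an assumption).
    """
    n = len(genotypes)
    freqs = allele_frequencies(genotypes)
    weights = []
    for f in freqs:
        q = (f * 2 * n + 1) / (2 * n + 2)
        weights.append(1 / math.sqrt(q * (1 - q)))
    return [[w * g for w, g in zip(weights, row)] for row in genotypes]


# Toy region: variant 0 is common, variants 1 and 2 each have one carrier.
toy = [[1 if i < 10 else 0, 1 if i == 0 else 0, 1 if i == 15 else 0]
       for i in range(20)]
collapsed = collapse_rare(toy)
weighted = weight_variants(toy)
```

Either output can then be passed to a genotype similarity measure. Collapsing reduces the region to fewer columns, while weighting preserves each variant and changes only its contribution to the similarity.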