Abstract

Simultaneous testing of multiple genetic variants for association is widely recognized as a valuable complement to single-marker tests, and principal component regression (PCR) has been found to have competitive power in this setting. We focus on developing a test that is robust to the unknown genetic mode of the SNPs, to unknown Hardy-Weinberg equilibrium (HWE) status in the population, and to a large number of SNPs. First, we propose a new global test that combines codominant coding of all markers with PCR. It is built on an empirical Bayes-type score statistic for testing the marginal association with each single marker, and it gains power by robustly exploiting HWE in the control population and by effectively using linkage disequilibrium among the test markers. The new global test reduces to PCR when the genotype of each marker is coded as the number of minor alleles, a connection that lends insight into its power relative to PCR and some other popular multimarker test methods. Second, we propose a robust test that takes the minimum p value of two tests: the new global test and the ordinary PCR test, the latter built on a prospective score statistic for testing the marginal association with each single marker when the genotype of each marker is coded as the number of minor alleles. Finally, through extensive simulation studies and an analysis of the association between pancreatic cancer and some genes of interest, we show that the proposed robust test has desirable power and can often identify association signals that may be missed by existing methods.
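
To make the two genotype codings and the minimum-p combination mentioned in the abstract concrete, here is a minimal Python sketch. All function names are hypothetical, and the Bonferroni-style adjustment of the minimum p value is an assumption made only for illustration; the abstract does not state how the significance of the minimum p value is calibrated. The sketch does not implement the empirical Bayes-type score statistic or PCR itself.

```python
def additive_code(n_minor):
    """Additive code: the genotype is the number of minor alleles (0, 1, or 2)."""
    if n_minor not in (0, 1, 2):
        raise ValueError("genotype must be 0, 1, or 2 minor alleles")
    return n_minor


def codominant_code(n_minor):
    """Codominant code: (heterozygote indicator, minor-homozygote indicator)."""
    if n_minor not in (0, 1, 2):
        raise ValueError("genotype must be 0, 1, or 2 minor alleles")
    return (int(n_minor == 1), int(n_minor == 2))


def hwe_genotype_freqs(q):
    """Expected frequencies of 0, 1, and 2 minor alleles under HWE,
    where q is the minor allele frequency."""
    return ((1 - q) ** 2, 2 * q * (1 - q), q ** 2)


def min_p_combined(p_new_global, p_pcr):
    """Combine two test p values by taking their minimum.

    ASSUMPTION: a conservative Bonferroni adjustment (multiply by 2, cap at 1)
    stands in for whatever calibration the paper actually uses.
    """
    return min(1.0, 2 * min(p_new_global, p_pcr))


# Hypothetical genotypes for one SNP in five subjects.
genotypes = [0, 1, 2, 1, 0]
additive = [additive_code(g) for g in genotypes]
codominant = [codominant_code(g) for g in genotypes]

# Hypothetical p values from the two tests.
combined = min_p_combined(0.03, 0.2)
```

The additive code is a linear function of the codominant indicators (heterozygote indicator plus twice the minor-homozygote indicator), so the codominant code leaves the heterozygote effect unconstrained, whereas the additive code fixes it at half the minor-homozygote effect.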

Highlights

  • Association analyses that test multiple genetic markers as a set rather than individually have been appreciated for their potential power. These statistical methods largely fall into three classes: those that summarize p values from the tests of each single marker [1,2,3,4,5]; those that synthesize single-marker test statistics, such as Hotelling's T² statistic [6,7,8] and the burden test [9,10]; and those based on a direct test of joint associations of multiple markers, such as variance component (VC) tests [11,12,13], the sequence kernel association test (SKAT) [14,15,16,17,18], and principal component regression (PCR) methods [19,20,21].

  • Kernel-machine-based tests make full use of possible correlations among score statistics, which is known to be advantageous for high-dimensional data [30], and are robust to the directions of association of different single-nucleotide polymorphisms (SNPs).

  • We find that when the linkage disequilibrium (LD) between each pair of SNPs is somewhat strong, principal component analysis methods may have higher power than kernel-machine-based tests.

Summary

Introduction

Association analyses that test multiple genetic markers as a set rather than individually have been appreciated for their potential power. These statistical methods largely fall into three classes: those that summarize p values from the tests of each single marker [1,2,3,4,5]; those that synthesize single-marker test statistics, such as Hotelling's T² (standard chi-squared) statistic [6,7,8] and the burden test [9,10]; and those based on a direct test of joint associations of multiple markers, such as variance component (VC) tests [11,12,13], the sequence kernel association test (SKAT) [14,15,16,17,18], and principal component regression (PCR) methods [19,20,21]. We focus on developing a test that is robust to the unknown genetic modes of the SNPs of interest, to unknown Hardy-Weinberg equilibrium (HWE) status in the population, and to a large number of SNPs of interest.
