Abstract

In genome-wide association studies (GWASs) of binary traits (case-control samples) that require covariate adjustment, researchers often use a logistic regression model to test variants for disease association. Popular tests include the Wald, likelihood ratio, and score tests. The likelihood ratio and Wald tests require a maximum likelihood estimate (MLE), obtained by an iterative procedure, for each single nucleotide polymorphism (SNP). In contrast, the score test requires the MLE only under the null model, so its computational cost is lower than that of the other tests. Genotype data usually include missing genotypes because of assay failures. These reduce the computational efficiency of the conventional score test (CST), which must re-estimate the null model for each SNP after excluding individuals whose genotype is missing. In this study, we propose two new score tests, called PM1 and PM2, that use a single global null estimator for all SNPs regardless of missing genotypes, thereby enabling faster computation than CST. We prove that PM2 and CST have equivalent asymptotic power and that the power of PM1 is asymptotically lower than that of PM2. We evaluate the type I error rates and power of the proposed methods through simulation studies and an application to real GWAS data provided by the Alzheimer’s Disease Neuroimaging Initiative (ADNI), which confirm our theoretical results. The ADNI-GWAS application showed that the proposed score tests are about 6–18 times faster than the existing tests (CST, Wald tests, and likelihood ratio tests). Our score tests are general and applicable to other regression models.
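To make the cost of the conventional score test concrete, the following is a minimal sketch of a textbook score test for one SNP in a logistic regression with covariates. It is an illustration under stated assumptions, not the authors' implementation: the function names `fit_null_logistic` and `conventional_score_test` are hypothetical, missing genotypes are assumed to be coded as NaN, and `X` is assumed to already contain an intercept column. PM1 and PM2 are not shown because their formulas are not given in this section.

```python
import math
import numpy as np

def fit_null_logistic(X, y, n_iter=25, tol=1e-8):
    """Newton-Raphson MLE for a logistic regression of y on covariates X.

    X is assumed to include an intercept column (illustrative convention).
    """
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        mu = 1.0 / (1.0 + np.exp(-(X @ beta)))   # fitted probabilities
        w = mu * (1.0 - mu)                       # Bernoulli variances
        grad = X.T @ (y - mu)                     # score vector
        hess = X.T @ (X * w[:, None])             # Fisher information
        step = np.linalg.solve(hess, grad)
        beta += step
        if np.max(np.abs(step)) < tol:
            break
    return beta

def conventional_score_test(g, X, y):
    """Score test for one SNP, refitting the null on non-missing individuals.

    g: genotype vector with NaN for missing calls (assumed coding).
    Returns (statistic, p-value); the statistic is referred to a
    chi-square distribution with 1 degree of freedom.
    """
    keep = ~np.isnan(g)
    g, X, y = g[keep], X[keep], y[keep]
    beta = fit_null_logistic(X, y)                # per-SNP iterative refit
    mu = 1.0 / (1.0 + np.exp(-(X @ beta)))
    w = mu * (1.0 - mu)
    u = g @ (y - mu)                              # score for the SNP effect
    xtwg = X.T @ (w * g)
    xtwx = X.T @ (X * w[:, None])
    # Information for the SNP effect after adjusting for the covariates.
    v = g @ (w * g) - xtwg @ np.linalg.solve(xtwx, xtwg)
    stat = float(u * u / v)
    p_value = math.erfc(math.sqrt(stat / 2.0))    # upper tail of chi-square(1)
    return stat, p_value
```

The call to `fit_null_logistic` inside `conventional_score_test` is the step that is repeated for every SNP whenever the set of non-missing individuals changes; a single global null fit, as the proposed tests use, would move that call outside the per-SNP loop.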

Highlights

  • Over the last decades, genome-wide association studies (GWASs) have successfully identified many variants associated with hundreds of human diseases and traits [1, 2]

  • We presented two new fast score tests, proposed method 1 (PM1) and proposed method 2 (PM2), that require only a single global null estimator for all single nucleotide polymorphisms (SNPs) in a genome-wide scan when missing genotypes are present

  • We confirmed that our proposed methods can significantly reduce the computational cost of genome-wide scans compared with conventional tests (e.g., the Wald test in PLINK) in an application to Alzheimer’s Disease Neuroimaging Initiative (ADNI)-GWAS data

Introduction

Genome-wide association studies (GWASs) have successfully identified many variants associated with hundreds of human diseases and traits [1, 2]. To discover associations between disease and genotypes, researchers often use tests based on a logistic regression model. Such a model can analyze the association between a disease (binary trait) and each single nucleotide polymorphism (SNP) while adjusting for the effects of covariates. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
