Abstract

Many multi-locus methods for genome-wide association studies (GWAS) have been developed to improve statistical power. However, most existing multi-locus methods are no faster than single-locus methods. To address this concern, we proposed a fast score test integrated with Empirical Bayes (ScoreEB) for multi-locus GWAS. First, a score test was conducted for each single nucleotide polymorphism (SNP) under a linear mixed model (LMM) framework, taking into account genetic relatedness and population structure. Then, all potentially associated SNPs were selected with a less stringent criterion. Finally, Empirical Bayes in a multi-locus model was performed on the selected SNPs to identify the true quantitative trait nucleotides (QTNs). ScoreEB adopts a strategy similar to that of the multi-locus random-SNP-effect mixed linear model (mrMLM) and fast multi-locus random-SNP-effect EMMA (FASTmrEMMA); the only difference is that it uses the score test to select the potentially associated markers. Monte Carlo simulation studies demonstrate that ScoreEB significantly improved computational efficiency compared with the popular methods mrMLM, FASTmrEMMA, iterative modified-sure independence screening EM-Bayesian lasso (ISIS EM-BLASSO), hybrid of restricted and penalized maximum likelihood (HRePML) and genome-wide efficient mixed model association (GEMMA). In addition, ScoreEB remained accurate in QTN effect estimation and effectively controlled the false positive rate. ScoreEB was then applied to re-analyze quantitative traits in plants and animals. The results show that ScoreEB can not only detect previously reported genes but also mine new genes.
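
As a rough illustration of the first two steps (a per-SNP score test followed by lenient screening), the Python sketch below assumes that the phenotype, the SNP genotypes and a single intercept covariate have already been rotated by the eigenvectors of the kinship matrix, so that the null covariance is diagonal with known inverse variances `w`. The function names, the inputs and the `threshold` default are illustrative assumptions and are not taken from the ScoreEB software; estimation of the variance components and the Empirical Bayes step are not shown.

```python
import math


def score_test(y, z, x, w):
    """Score statistic and p-value for one SNP under a diagonal null covariance.

    y, z, x: phenotype, SNP genotype and one covariate (e.g. the intercept),
             all assumed to be rotated so the null covariance is diagonal.
    w:       inverse null variances, one per individual (assumed known here).
    """
    xwx = sum(wi * xi * xi for wi, xi in zip(w, x))
    xwy = sum(wi * xi * yi for wi, xi, yi in zip(w, x, y))
    zwx = sum(wi * zi * xi for wi, zi, xi in zip(w, z, x))
    zwy = sum(wi * zi * yi for wi, zi, yi in zip(w, z, y))
    zwz = sum(wi * zi * zi for wi, zi in zip(w, z))

    score = zwy - zwx * xwy / xwx   # z' P y, with the covariate projected out
    info = zwz - zwx * zwx / xwx    # z' P z, the variance of the score
    if info <= 1e-12:               # monomorphic SNP: no information
        return 0.0, 1.0

    stat = score * score / info     # compared with chi-square, 1 df
    p_value = math.erfc(math.sqrt(stat / 2.0))  # chi-square(1) upper tail
    return stat, p_value


def screen_snps(y, snps, x, w, threshold=0.01):
    """Indices of SNPs whose score-test p-value falls below a lenient threshold."""
    return [j for j, z in enumerate(snps)
            if score_test(y, z, x, w)[1] < threshold]


if __name__ == "__main__":
    y = [1, 2, 3, 4, 5, 6, 7, 8]
    x = [1] * 8                     # intercept
    w = [1.0] * 8                   # unit null variances (toy values)
    snps = [
        [0, 0, 0, 0, 1, 1, 1, 1],   # tracks the phenotype
        [0, 1, 0, 1, 0, 1, 0, 1],   # weakly related to the phenotype
        [1, 1, 1, 1, 1, 1, 1, 1],   # monomorphic
    ]
    print(screen_snps(y, snps, x, w))
```

Because the null model is fitted once and only inner products are needed per SNP, the screening cost grows linearly with the number of markers; the SNPs that pass the lenient threshold would then be handed to the multi-locus model.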

Highlights

  • Genome-wide association studies (GWAS) have become a powerful approach in the genetic dissection of quantitative traits in human, animal and plant genetics (Buniello et al, 2019; Jiang et al, 2019)

  • Each experiment was analyzed by six methods: a fast score test integrated with Empirical Bayes (ScoreEB), multi-locus random-SNP-effect mixed linear model (mrMLM), fast multi-locus random-single nucleotide polymorphism (SNP)-effect EMMA (FASTmrEMMA), iterative modified-sure independence screening EM-Bayesian lasso (ISIS EM-BLASSO), hybrid of restricted and penalized maximum likelihood (HRePML) and genome-wide efficient mixed model association (GEMMA)

  • In the first simulation experiment, where six quantitative trait nucleotide (QTN) effects and an additive polygenic effect were involved, the areas under the Power-FDR curve (AUC.FDR) for the ScoreEB, multi-locus random-SNP-effect mixed linear model (mrMLM), FASTmrEMMA, iterative modified-sure independence screening (ISIS) EM-BLASSO, HRePML and GEMMA methods were 0.4405, 0.4651, 0.4583, 0.4020, 0.4385 and 0.3358, respectively, showing that ScoreEB, mrMLM and FASTmrEMMA have similar power, which is significantly higher than that of GEMMA (Figure 1A)

Summary

Introduction

Genome-wide association studies (GWAS) have become a powerful approach in the genetic dissection of quantitative traits in human, animal and plant genetics (Buniello et al, 2019; Jiang et al, 2019). [...] et al, 2011), genome-wide efficient mixed model association (GEMMA) (Zhou and Stephens, 2012), BOLT-LMM (Loh et al, 2015), and the rapid and efficient linear mixed model approach using the score test (LMM-Score) (Chang et al, 2019b). Although these methods have successfully detected a number of variants for various traits, they still have some shortcomings. Most adopt single-locus screening, so the combined effects of multiple loci are ignored and the threshold for multiple-test correction is often difficult to determine (Wang et al, 2016; Ren et al, 2018; Wen et al, 2018).

