Abstract

Pleiotropy (i.e., genes with effects on multiple traits) leads to genetic correlations between traits and contributes to the development of many syndromes. Identifying variants with pleiotropic effects on multiple health-related traits can improve the biological understanding of gene action and disease etiology, and can help to advance disease-risk prediction. Sequential testing is a powerful approach for mapping genes with pleiotropic effects. However, the existing methods and the available software do not scale to analyses involving millions of SNPs and large datasets. This has limited the adoption of sequential testing for pleiotropy mapping at large scale. In this study, we present a sequential test and software that can be used to test pleiotropy in large systems of traits with biobank-sized data. Using simulations, we show that the methods implemented in the software are powerful and have adequate type-I error rate control. To demonstrate the use of the methods and software, we present a whole-genome scan in search of loci with pleiotropic effects on seven traits related to metabolic syndrome (MetS) using UK-Biobank data (n~300 K distantly related white European participants). We found abundant pleiotropy and report 170, 44, and 18 genomic regions harboring SNPs with pleiotropic effects in at least two, three, and four of the seven traits, respectively. We validate our results using previous studies documented in the GWAS-catalog and using data from GTEx. Our results confirm previously reported loci and lead to several novel discoveries that link MetS-related traits through plausible biological pathways.

Highlights

  • In this study, we describe the methodology implemented in pleiotest, present extensive simulations showing that the proposed approximation has the same power and error control as the original test, and provide a benchmark showing that our method is orders of magnitude faster than the sequence of likelihood-ratio tests (sLRT) and scales well to analyses involving many traits (e.g., 10 or more) and very large sample sizes (e.g., n > 300 K; K = 1000)

  • The simulation results (Table 1), which were based on 500 million Monte Carlo (MC)-replicates, showed that both sLRT and pleiotest had highly accurate type I error rate control

  • TRI was the trait with the largest number of SNPs with simultaneous significant associations with it and at least another trait (1953); URA was the trait most often involved in regions exhibiting pleiotropic effects (99 segments involving SNPs with pleiotropic effects harbored SNPs with significant effects for URA)
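The Monte Carlo check of type I error control mentioned in the highlights can be illustrated with a toy example. The sketch below is not the authors' simulation design and does not use pleiotest: it assumes a single SNP, a single trait, no true association, and a normal approximation to the test statistic. The function names and default settings are hypothetical. It estimates the fraction of null replicates rejected at a nominal level, which should be close to that level if the test is well calibrated.

```python
import math
import random
from statistics import NormalDist


def null_pvalue(rng, n, maf):
    """Simulate one SNP and one trait with NO association; return a two-sided p-value."""
    # Genotype = count of minor alleles (0, 1, or 2), two independent allele draws.
    x = [(rng.random() < maf) + (rng.random() < maf) for _ in range(n)]
    # Trait drawn independently of the genotype, so the null hypothesis holds.
    y = [rng.gauss(0.0, 1.0) for _ in range(n)]
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    if sxx == 0.0:
        return 1.0  # monomorphic SNP: nothing to test
    r = sxy / math.sqrt(sxx * syy)
    # t statistic of the simple regression slope, compared with a standard
    # normal distribution (an approximation that is reasonable for large n).
    t = r * math.sqrt((n - 2) / (1.0 - r * r))
    return 2.0 * (1.0 - NormalDist().cdf(abs(t)))


def estimate_type1_error(n_reps=2000, n=200, maf=0.3, alpha=0.05, seed=1):
    """Fraction of null replicates rejected at level alpha."""
    rng = random.Random(seed)
    rejections = sum(null_pvalue(rng, n, maf) < alpha for _ in range(n_reps))
    return rejections / n_reps


rate = estimate_type1_error()
print(f"empirical type I error rate at alpha = 0.05: {rate:.4f}")
```

With only a few thousand replicates the estimate is noisy. Assessing much smaller significance levels, such as those used in genome-wide scans, requires far more replicates, which is consistent with the very large number of replicates reported above.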


Introduction

Many human diseases (e.g., hypertension, gout, and diabetes) cluster into syndromes. Evidence from quantitative genetic studies [1, 2] and from genome-wide association (GWA) analyses [3] suggests that pleiotropy (i.e., variants with simultaneous effects on several traits) is an important factor in this clustering.

To confront the challenges posed by the analysis of systems of many traits, some authors have considered using phenotype-derived principal components (PCs) as traits in GWA analyses. This approach has been used to identify variants associated with patterns shared across multiple MetS-related traits [7]; however, it has several limitations.

In this study, we describe the methodology implemented in pleiotest, present extensive simulations showing that the proposed approximation has the same power and error control as the original test, and provide a benchmark showing that our method is orders of magnitude faster than the sequence of likelihood-ratio tests (sLRT) and scales well to analyses involving many traits (e.g., 10 or more) and very large sample sizes (e.g., n > 300 K; K = 1000).
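The principal-component approach mentioned in the introduction can be sketched as follows. This is a minimal illustration and not the procedure of the cited studies: it assumes a complete trait matrix with no missing values, standardizes each trait, and takes the eigenvectors of the trait correlation matrix as loadings. The function name `phenotype_pcs` and the simulated data are hypothetical.

```python
import numpy as np


def phenotype_pcs(traits):
    """Return PC scores and explained-variance ratios for an (n, t) trait matrix."""
    # Standardize each trait so that PCs come from the correlation matrix.
    z = (traits - traits.mean(axis=0)) / traits.std(axis=0, ddof=1)
    corr = np.cov(z, rowvar=False)
    # eigh returns eigenvalues in ascending order; reorder to descending.
    eigvals, eigvecs = np.linalg.eigh(corr)
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    scores = z @ eigvecs  # one column per PC, one row per individual
    return scores, eigvals / eigvals.sum()


# Simulated example: seven traits sharing one latent factor (assumed data).
rng = np.random.default_rng(0)
shared = rng.normal(size=(500, 1))
traits = 0.7 * shared + rng.normal(size=(500, 7))
scores, ratios = phenotype_pcs(traits)
print(scores.shape, np.round(ratios, 3))
```

Each column of `scores` could then be used as the response in a standard single-trait GWA analysis. Because a PC mixes all traits, a significant association with a PC does not by itself indicate which individual traits a variant affects.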

