Abstract

Background: Identifying phenotypic correlations between complex traits and diseases can provide useful etiological insights. Restricted access to much individual-level phenotype data makes it difficult to estimate large-scale phenotypic correlation across the human phenome. Two state-of-the-art methods, metaCCA and LD score regression, provide an alternative approach to estimate phenotypic correlation using only genome-wide association study (GWAS) summary results.

Results: Here, we present an integrated R toolkit, PhenoSpD, to use LD score regression to estimate phenotypic correlations using GWAS summary statistics and to utilize the estimated phenotypic correlations to inform correction of multiple testing for complex human traits using the spectral decomposition of matrices (SpD). The simulations suggest that it is possible to identify nonindependence of phenotypes using samples with partial overlap; as overlap decreases, the estimated phenotypic correlations will attenuate toward zero and multiple testing correction will be more stringent than in perfectly overlapping samples. Also, in contrast to LD score regression, metaCCA will provide approximate genetic correlations rather than phenotypic correlation, which limits its application for multiple testing correction. In a case study, PhenoSpD using UK Biobank GWAS results suggested 399.6 independent tests among 487 human traits, which is close to the 352.4 independent tests estimated using true phenotypic correlation. We further applied PhenoSpD to an estimated 5,618 pair-wise phenotypic correlations among 107 metabolites using GWAS summary statistics from Kettunen's publication and PhenoSpD suggested the equivalent of 33.5 independent tests for these metabolites.

Conclusions: PhenoSpD extends the use of summary-level results, providing a simple and conservative way to reduce dimensionality for complex human traits using GWAS summary statistics. This is particularly valuable in the age of large-scale biobank and consortia studies, where GWAS results are much more accessible than individual-level data.
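
The attenuation under partial sample overlap can be illustrated with a short sketch. It assumes the standard cross-trait LD score regression result that the regression intercept is approximately the phenotypic correlation among overlapping samples scaled by N_overlap / sqrt(N1 * N2). The function name and the sample sizes below are hypothetical and are not taken from PhenoSpD.

```python
import math


def expected_ldsc_intercept(r_pheno, n_overlap, n1, n2):
    """Approximate cross-trait LD score regression intercept.

    Assumption (not from the PhenoSpD source): the intercept is about
    r_pheno * n_overlap / sqrt(n1 * n2), where r_pheno is the phenotypic
    correlation among the overlapping samples of the two GWASs.
    """
    if n_overlap < 0 or n_overlap > min(n1, n2):
        raise ValueError("n_overlap must lie between 0 and min(n1, n2)")
    return r_pheno * n_overlap / math.sqrt(n1 * n2)


if __name__ == "__main__":
    # Hypothetical example: true phenotypic correlation 0.6, two GWASs of
    # 10,000 samples each, with a shrinking number of shared individuals.
    for n_overlap in (10_000, 5_000, 1_000, 0):
        est = expected_ldsc_intercept(0.6, n_overlap, 10_000, 10_000)
        print(f"overlap={n_overlap:>6}: estimated correlation ~ {est:.2f}")
```

With full overlap the estimate equals the true correlation; as overlap shrinks the estimate moves toward zero, so traits look more independent and the resulting multiple testing correction becomes more stringent.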

Highlights

  • Phenotypic correlations between complex human traits and diseases can provide useful etiological insights into mechanisms across the human phenome

  • Here, we present an integrated R toolkit, PhenoSpD, to use linkage disequilibrium (LD) score regression to estimate phenotypic correlations using genome-wide association study (GWAS) summary statistics and to utilize the estimated phenotypic correlations to inform correction of multiple testing for complex human traits using the spectral decomposition of matrices (SpD)

  • We present an integrative method, PhenoSpD, that allows phenotypic correlation estimation and multiple testing correction for the human phenome using GWAS summary statistics


Summary

Introduction

Phenotypic correlations between complex human traits and diseases can provide useful etiological insights into mechanisms across the human phenome. The bivariate LD score regression approach allows estimation of the phenotypic correlation among the overlapping samples of two GWASs. Assuming the genetic and nongenetic components of two phenotypes are independent, the genetic covariance matrix (built from the beta coefficients of the genetic association tests) captures the genetic effects, while the error covariance matrix (built from the error terms of the genetic association tests) captures the environmental (nongenetic) effects. Large-scale genetic association databases such as MR-Base [4] and LD Hub [5] have harmonized GWAS summary-level results for roughly 1,700 human traits. This provides a timely opportunity to estimate the phenotypic correlation structure across a wide range of high-dimensional, complex molecular traits, such as metabolites, that are potentially highly correlated. We combine LD score regression with the spectral decomposition of matrices (SpD) to estimate the number of independent tests using only summary-level GWAS data.
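
As a rough illustration of the SpD step, the sketch below computes an effective number of independent tests from a phenotypic correlation matrix. It assumes Nyholt's variance-of-eigenvalues estimator, Meff = 1 + (M - 1)(1 - Var(λ)/M); whether PhenoSpD reports exactly this variant is an assumption here. For a symmetric correlation matrix the eigenvalues sum to M and their squares sum to the sum of squared matrix entries, so the sketch obtains Var(λ) from that identity rather than running an explicit eigendecomposition. The function names are hypothetical.

```python
def effective_tests_nyholt(corr):
    """Effective number of independent tests for a correlation matrix.

    Assumes Nyholt's estimator: Meff = 1 + (M - 1) * (1 - Var(lambda) / M),
    with Var(lambda) the sample variance of the eigenvalues. For a symmetric
    correlation matrix, sum(lambda) = M and sum(lambda^2) = sum of r_ij^2,
    so Var(lambda) = (sum of r_ij^2 - M) / (M - 1).
    """
    m = len(corr)
    if any(len(row) != m for row in corr):
        raise ValueError("correlation matrix must be square")
    if m == 1:
        return 1.0
    sum_sq = sum(r * r for row in corr for r in row)
    var_lambda = (sum_sq - m) / (m - 1)
    return 1 + (m - 1) * (1 - var_lambda / m)


def bonferroni_threshold(alpha, n_effective):
    """Bonferroni-style significance threshold using the effective test count."""
    return alpha / n_effective


if __name__ == "__main__":
    # Hypothetical 3-trait example: two strongly correlated traits, one independent.
    corr = [
        [1.0, 0.8, 0.0],
        [0.8, 1.0, 0.0],
        [0.0, 0.0, 1.0],
    ]
    meff = effective_tests_nyholt(corr)
    print(f"effective tests: {meff:.2f} of {len(corr)} traits")
    print(f"corrected threshold: {bonferroni_threshold(0.05, meff):.4f}")
```

Uncorrelated traits give Meff = M and perfectly correlated traits give Meff = 1, so correlations that are underestimated push Meff upward and make the correction more conservative.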
