Abstract

Polygenic risk scores (PRSs) have become an increasingly popular approach for demonstrating polygenic influences on complex traits and for establishing common polygenic signals between different traits. PRSs are typically constructed using pruning and thresholding (P+T), but the best choice of parameters is uncertain, so multiple settings are usually tried and the best-performing one is chosen. Optimizing over settings in this way can inflate the Type I error rate. Permutation procedures can correct for this, but they can be computationally intensive. Alternatively, a single parameter setting can be chosen a priori for the PRS, but a suboptimal choice results in a loss of power. We propose computing PRSs under a range of parameter settings, performing principal component analysis (PCA) on the resulting set of PRSs, and using the first PRS-PC in association tests. The first PC reweights the variants included in the PRS to achieve maximum variation over all PRS settings used. Using simulations and a real data application to study PRS association with bipolar disorder and psychosis in bipolar disorder, we compare the performance of the proposed PRS-PCA approach with a permutation test and an a priori selected p-value threshold. The PRS-PCA approach is simple to implement, outperforms the other strategies in most scenarios, and provides an unbiased estimate of prediction performance.
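The proposed procedure can be illustrated with a minimal sketch. It is not the authors' implementation: the data are simulated, the SNPs are assumed to be LD-pruned already, and the function names (`prs_matrix`, `first_prs_pc`), the threshold grid, the column standardization before PCA, and the correlation-based t-statistic used as the association test are all assumptions made for illustration.

```python
import numpy as np

def prs_matrix(genotypes, betas, pvalues, thresholds):
    """Return one PRS column per p-value threshold.

    Assumes the SNPs have already been LD-pruned, so only the
    thresholding half of P+T is performed here.
    """
    cols = []
    for t in thresholds:
        keep = pvalues <= t  # SNPs passing this threshold
        cols.append(genotypes[:, keep] @ betas[keep])
    return np.column_stack(cols)

def first_prs_pc(prs):
    """Return the first PC score and its loadings for a set of PRSs.

    Columns are standardized first (an assumption of this sketch), so the
    PCA is effectively performed on the correlation matrix of the PRSs.
    """
    z = (prs - prs.mean(axis=0)) / prs.std(axis=0)
    # Rows of vt are the PC loadings, ordered by explained variance.
    _, _, vt = np.linalg.svd(z, full_matrices=False)
    loadings = vt[0]
    if loadings.sum() < 0:  # the sign of a PC is arbitrary; orient it with the PRSs
        loadings = -loadings
    return z @ loadings, loadings

# Simulated toy data (hypothetical, for illustration only).
rng = np.random.default_rng(0)
n_ind, n_snp = 300, 500
maf = rng.uniform(0.05, 0.5, size=n_snp)
genotypes = rng.binomial(2, maf, size=(n_ind, n_snp)).astype(float)
true_effects = rng.normal(0.0, 0.05, size=n_snp)
betas = true_effects + rng.normal(0.0, 0.05, size=n_snp)   # noisy "GWAS" estimates
pvalues = rng.permutation(np.linspace(0.001, 1.0, n_snp))  # stand-in GWAS p-values
phenotype = genotypes @ true_effects + rng.normal(size=n_ind)

thresholds = [0.01, 0.05, 0.1, 0.5, 1.0]
prs = prs_matrix(genotypes, betas, pvalues, thresholds)
pc1, loadings = first_prs_pc(prs)

# A single association test on the first PRS-PC, with no search over thresholds.
r = np.corrcoef(pc1, phenotype)[0, 1]
t_stat = r * np.sqrt((n_ind - 2) / (1.0 - r**2))
print("PC1 loadings:", np.round(loadings, 3))
print("correlation with phenotype:", round(float(r), 3), "t-statistic:", round(float(t_stat), 2))
```

Because only one score (the first PRS-PC) is tested, no selection over thresholds takes place, so the multiple-testing inflation described above does not arise in this step.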

Highlights

  • Polygenic risk scores (PRSs) have become an increasingly popular tool in genetics research

  • When K is large (e.g. K = 10^6), single nucleotide polymorphisms (SNPs) with larger p-values are down-weighted less heavily and the PRS-principal component analysis (PCA) weights are almost proportional to weights for the PRS with p-value threshold of 1

  • We proposed a method of PRS analysis that uses PCA to concentrate the maximum variation in a set of PRSs in a single PC, and tests for association of the phenotype with only the first PC

Introduction

Polygenic risk scores (PRSs) have become an increasingly popular tool in genetics research. PRSs leverage summary statistics from previous genome-wide association studies (GWASs) to predict risk for individuals in a new population. If the individuals’ predicted risk is associated with their phenotype, this approach provides evidence of a polygenic effect even when no genome-wide significant variants exist. When a PRS for one trait is associated with another trait, this approach can be used to establish common polygenic signals between the two traits. A simple summation across single nucleotide polymorphisms (SNPs) that ignores the linkage disequilibrium (LD) among them would not be appropriate, because trait-associated regions with high LD would be over-weighted. The most common approach, the so-called “pruning-and-thresholding” (P+T) method, constructs the PRS by first removing SNPs in high LD to obtain a set of roughly independent
