Abstract
Background/aim
The polygenic risk score (PRS) shows promise as an approach for summarizing genetic risk for complex diseases such as alcohol use disorder, which are influenced by a combination of many variants, each of which has a very small effect. Yet conventional PRS methods tend to over-adjust for confounding factors in the discovery sample and thus have low power to predict the phenotype in the target sample. This study aims to address this methodological issue.
Methods
This study proposed a new method to construct PRS by (1) approximating the polygenic model using a few principal components selected based on eigen-correlation in the discovery data; and (2) conducting principal component projection on the target data. Secondary data analysis was conducted on two large-scale databases, the Study of Addiction: Genetics and Environment (SAGE; discovery data) and the National Longitudinal Study of Adolescent to Adult Health (Add Health; target data), to compare the performance of the conventional and proposed methods.
Results and conclusion
The results show that the proposed method has higher prediction power and can handle participants from different ancestry backgrounds. We also provide practical recommendations for setting the linkage disequilibrium (LD) and p value thresholds.
Highlights
Genome-wide association studies (GWAS) have been used to identify variants that are significantly associated with the phenotype of interest
Since we propose to use the principal components as predictors of the phenotypes, choosing variants significantly associated with the phenotypes and using these variants to derive the principal components would increase the association in Eq (5)
The average number of alcohol use disorder (AUD) symptoms (0.80–2.10 out of 11) in the Add Health dataset was low because the sample represented the general population
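The two-step principal component approach named in the abstract (select components in the discovery data, then project the target data onto them) can be illustrated with a minimal sketch. It is not the authors' implementation. In particular, it assumes that "eigen-correlation" means the absolute correlation between a principal component's scores and the phenotype, which may differ from the paper's definition. The function names `fit_pc_prs` and `predict_pc_prs`, the parameter `k`, and the linear regression on the selected components are all hypothetical choices made for illustration.

```python
import numpy as np

def fit_pc_prs(G_disc, y_disc, k=2):
    """Fit a PC-based score on discovery data (hypothetical helper).

    G_disc: (samples, variants) allele counts; y_disc: (samples,) phenotype.
    """
    G_disc = np.asarray(G_disc, dtype=float)
    y_disc = np.asarray(y_disc, dtype=float)
    mu = G_disc.mean(axis=0)
    sd = G_disc.std(axis=0)
    sd[sd == 0] = 1.0  # guard against monomorphic variants
    Z = (G_disc - mu) / sd

    # Rows of Vt are the variant loadings of each principal component.
    U, s, Vt = np.linalg.svd(Z, full_matrices=False)

    # ASSUMPTION: "eigen-correlation" is taken to be |corr(PC score, phenotype)|.
    # PC scores are U * s, so the correlation reduces to u_j . yc / ||yc||.
    yc = y_disc - y_disc.mean()
    corr = np.abs(U.T @ yc) / np.linalg.norm(yc)
    corr[s <= 1e-10 * s[0]] = 0.0  # ignore numerically empty components

    keep = np.argsort(corr)[::-1][:k]  # k components most correlated with y
    V = Vt[keep].T                     # (variants, k) loadings

    # Regress the phenotype on the selected PC scores (with an intercept).
    X = np.column_stack([np.ones(len(y_disc)), Z @ V])
    beta, *_ = np.linalg.lstsq(X, y_disc, rcond=None)
    return {"mu": mu, "sd": sd, "V": V, "beta": beta}

def predict_pc_prs(model, G_target):
    """Project target genotypes onto the discovery PCs and score them."""
    Z = (np.asarray(G_target, dtype=float) - model["mu"]) / model["sd"]
    X = np.column_stack([np.ones(Z.shape[0]), Z @ model["V"]])
    return X @ model["beta"]
```

The target data are standardized with the discovery means and standard deviations and multiplied by the discovery loadings, so no phenotype information from the target sample enters the score.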
Summary
Genome-wide association studies (GWAS) have been used to identify variants that are significantly associated with the phenotype of interest. Many GWAS with small to moderate sample sizes fail to identify important variants even though the phenotype has been shown to be highly heritable. This phenomenon is called the “missing heritability problem” [1]. PRS derived its name from the notion that complex diseases are highly polygenic [3], with the effect of each variant being very small. To deal with this issue, the PRS approach proposes an additive model to summarize the marginal effects of many variants to quantify genetic influences on a particular phenotype [4]. Wray et al. [5] were the first to apply the PRS approach in GWAS.
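The additive model described above can be written as a weighted sum of allele counts. The sketch below is a simplified illustration, not the procedure used in the study: the function name `additive_prs`, its parameters, and the default threshold are hypothetical, and LD-based pruning or clumping is omitted.

```python
import numpy as np

def additive_prs(genotypes, betas, pvalues, p_threshold=0.05):
    """Conventional additive PRS (illustrative sketch).

    genotypes: (individuals, variants) allele counts coded 0/1/2
    betas:     per-variant marginal effect sizes from the discovery GWAS
    pvalues:   per-variant GWAS p values
    """
    genotypes = np.asarray(genotypes, dtype=float)
    betas = np.asarray(betas, dtype=float)
    # Keep only variants that pass the p value threshold.
    keep = np.asarray(pvalues, dtype=float) <= p_threshold
    # Score = sum over retained variants of allele count * effect size.
    return genotypes[:, keep] @ betas[keep]

# Toy example with made-up numbers: two individuals, three variants.
scores = additive_prs(
    genotypes=[[0, 1, 2], [2, 0, 1]],
    betas=[0.1, -0.2, 0.3],
    pvalues=[0.01, 0.20, 0.04],
)
```

In the toy example the second variant fails the threshold, so the scores are approximately 0.6 for the first individual and 0.5 for the second.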