Abstract

Polygenic risk scores (PRS), which sum the effects of single nucleotide polymorphisms (SNPs) throughout the genome to measure the risk conferred by common genetic variants, have been used to estimate disorder risk for ADHD, but the accuracy of risk prediction is very low. Our goal was to improve the predictive accuracy of PRS using machine learning and to use the results to make inferences about the genetic architecture of ADHD and comorbid disorders. We performed gene set analysis of genome-wide association study (GWAS) data to select gene sets associated with ADHD within a training subset. For each selected gene set, we generated a gene set PRS (gsPRS), which sums the effects of the SNPs in that gene set. We created gsPRS for ADHD and for disorders having a high genetic correlation with ADHD. These gsPRS were added to the standard PRS as input to machine learning models predicting ADHD. We used feature importance scores to select gsPRS for a final model and to generate a ranking of the most consistently predictive gsPRS. For a test subset that had not been used for training or validation, a random forest (RF) model had an area under the receiver operating characteristic curve (AUC) of 0.72 (95% CI, 0.70-0.74). This AUC was a statistically significant improvement over logistic regression models using only traditional or modern PRS scoring methods, and over logistic regression and RF models using PRS from ADHD and genetically related disorders. Summing risk at the gene set level and incorporating genetic risk from disorders with high genetic correlations with ADHD improved the accuracy of predicting ADHD. Learning curves suggest that additional improvements would be expected with larger study sizes. Our study suggests that better accounting of genetic risk and of the genetic context of allelic differences results in more predictive models.
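
As a rough illustration of the scoring step described in the abstract, the sketch below computes a genome-wide PRS and a gene set PRS as weighted sums of allele dosages. It is only a minimal sketch under stated assumptions: the function name `polygenic_score`, the toy dosages, the effect sizes, and the gene set mask are all hypothetical, and the study's actual scoring pipeline is not specified in the abstract.

```python
import numpy as np

def polygenic_score(dosages, effect_sizes, snp_mask=None):
    """Weighted sum of allele dosages for each individual.

    dosages:      (n_individuals, n_snps) risk-allele counts (0, 1, or 2).
    effect_sizes: (n_snps,) per-SNP effect estimates from a GWAS.
    snp_mask:     optional boolean (n_snps,) array; when given, only the
                  selected SNPs contribute, which yields a gene set score.
    """
    dosages = np.asarray(dosages, dtype=float)
    effect_sizes = np.asarray(effect_sizes, dtype=float)
    if snp_mask is not None:
        snp_mask = np.asarray(snp_mask, dtype=bool)
        dosages = dosages[:, snp_mask]
        effect_sizes = effect_sizes[snp_mask]
    return dosages @ effect_sizes

# Toy data: 2 individuals, 4 SNPs (all values are made up for illustration).
dosages = np.array([[0, 1, 2, 1],
                    [2, 0, 1, 0]])
effects = np.array([0.10, -0.05, 0.20, 0.03])

# Hypothetical gene set containing the first and third SNPs.
gene_set = np.array([True, False, True, False])

prs = polygenic_score(dosages, effects)               # standard PRS
gs_prs = polygenic_score(dosages, effects, gene_set)  # gene set PRS

# Standard PRS and gsPRS stacked as columns of a model feature matrix.
features = np.column_stack([prs, gs_prs])
```

In this framing, each selected gene set (and each genetically correlated disorder) would contribute one more column to `features`, which a classifier such as a random forest could then take as input.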
