Abstract

Risk prediction that capitalizes on emerging genetic findings holds great promise for improving public health and clinical care. However, recent risk prediction research has shown that predictive tests built on existing common genetic loci, including those from genome-wide association studies, lack sufficient accuracy for clinical use. Because most rare variants in the genome have not yet been studied for their role in risk prediction, future disease prediction research should shift toward a more comprehensive strategy that takes into account both common and rare variants. We propose a collapsing receiver operating characteristic (CROC) approach for risk prediction research on both common and rare variants. The new approach extends a previously developed forward ROC (FROC) approach with additional procedures for handling rare variants. We evaluated the approach using 533 single-nucleotide polymorphisms (SNPs) in 37 candidate genes from the Genetic Analysis Workshop 17 mini-exome data set. A prediction model built on all SNPs was more accurate (AUC = 0.605) than one built on common variants alone (AUC = 0.585). We further compared the CROC and FROC approaches by gradually reducing the number of common variants in the analysis. The CROC method was more accurate than the FROC method as the number of common variants in the data decreased. In the extreme scenario, when the data contained only rare variants, the CROC method reached an AUC of 0.603, whereas the FROC method reached an AUC of 0.524.
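The accuracy figures above are areas under the ROC curve (AUC). As a reminder of what that number measures, here is a minimal sketch that computes the AUC as the probability that a randomly chosen case receives a higher risk score than a randomly chosen control, with ties counted as one half. The function name `auc` and the 0/1 label coding are assumptions made for illustration; this is not the evaluation code used in the study.

```python
def auc(scores, labels):
    """Empirical AUC: P(case score > control score), ties count as 1/2.

    scores: predicted risk scores, one per individual (hypothetical input).
    labels: 1 for cases, 0 for controls (assumed coding).
    """
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")

    wins = 0.0
    # Compare every case with every control (the Mann-Whitney statistic).
    for c in cases:
        for k in controls:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5
    return wins / (len(cases) * len(controls))
```

An AUC of 0.5 corresponds to a score that ranks cases and controls no better than chance, and 1.0 to a score that ranks every case above every control.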

Highlights

  • We evaluated the performance of the collapsing receiver operating characteristic (CROC) approach using the simulated Genetic Analysis Workshop 17 (GAW17) mini-exome sequencing data

  • Using the GAW17 data, we investigated whether the accuracy of the risk prediction model could be improved by considering rare variants in the analysis

Summary

Introduction

The completion of hundreds of genome-wide association studies has brought numerous novel disease susceptibility loci to light. For many diseases, however, the common variants identified so far explain only a small proportion of disease heritability, and attention has therefore turned to rare variants. Current genome-wide association studies include only single-nucleotide polymorphisms (SNPs) with a minor allele frequency (MAF) greater than 5% [1,2]. Within the next few years, whole-genome sequencing will produce millions of rare variants, with the expectation that some of them might explain part of the missing heritability. Experimental studies have already shown that rare variants are associated with complex diseases, such as obesity [3], schizophrenia [4], and colorectal cancer [5].
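To make the common/rare distinction concrete, the sketch below splits SNPs at the 5% MAF cutoff mentioned above and collapses the rare ones into a single carrier indicator per individual. This illustrates the generic idea of collapsing rare variants only; it is not the authors' CROC procedure, whose details are not given in this excerpt. The function names, the 0/1/2 minor-allele-count coding, and the "carries any rare allele" rule are all assumptions.

```python
def minor_allele_frequency(genotypes):
    """MAF of one SNP, assuming genotypes are minor-allele counts (0, 1, or 2)."""
    return sum(genotypes) / (2 * len(genotypes))


def collapse_rare(snps, maf_threshold=0.05):
    """Split SNPs into common and rare, then collapse the rare ones.

    snps: list of SNPs, each a list of 0/1/2 genotypes across individuals
          (hypothetical layout chosen for this sketch).
    Returns (common_snps, collapsed), where collapsed[i] is 1 if individual i
    carries at least one minor allele at any rare SNP, else 0.
    """
    if not snps:
        return [], []
    n_individuals = len(snps[0])

    common, rare = [], []
    for genotypes in snps:
        # SNPs with MAF greater than the threshold are treated as common.
        if minor_allele_frequency(genotypes) > maf_threshold:
            common.append(genotypes)
        else:
            rare.append(genotypes)

    collapsed = [
        1 if any(genotypes[i] > 0 for genotypes in rare) else 0
        for i in range(n_individuals)
    ]
    return common, collapsed
```

A collapsed indicator of this kind can then enter a prediction model alongside the common SNPs as one extra predictor, which is one simple way to let rare variants contribute despite their low individual frequencies.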

