Estimating the predictive ability of genetic risk models in simulated data based on published results from genome-wide association studies.

Suman Kundu, Raluca Mihaescu, A Cecile J W Janssens, Rachel Bakker, Catherina M C Meijer

doi:10.3389/fgene.2014.00179

Abstract

Background: There is increasing interest in investigating genetic risk models in empirical studies, but such studies are premature when the expected predictive ability of the risk model is low. We assessed how accurately the predictive ability of genetic risk models can be estimated in simulated data that are created based on the odds ratios (ORs) and frequencies of single-nucleotide polymorphisms (SNPs) obtained from genome-wide association studies (GWASs).

Methods: We aimed to replicate published prediction studies that reported the area under the receiver operating characteristic curve (AUC) as a measure of predictive ability. We searched GWAS articles for all SNPs included in these models and extracted ORs and risk allele frequencies to construct genotypes and disease status for a hypothetical population. Using these hypothetical data, we reconstructed the published genetic risk models and compared their AUC values to those reported in the original articles.

Results: The accuracy of the AUC values varied with the method used to construct the risk models. When logistic regression analysis was used to construct the genetic risk model, AUC values estimated by the simulation method were similar to the published values, with a median absolute difference of 0.02 [range: 0.00, 0.04]. This difference was 0.03 [range: 0.01, 0.06] for unweighted risk scores and 0.05 [range: 0.01, 0.08] for weighted risk scores.

Conclusions: The predictive ability of genetic risk models can be estimated using simulated data based on results from GWASs. Simulation methods can be useful to estimate the predictive ability in the absence of empirical data and to decide whether empirical investigation of genetic risk models is warranted.
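The simulation idea described in the Methods can be illustrated with a minimal Python sketch. It is not the authors' procedure: it assumes independent SNPs in Hardy-Weinberg equilibrium, a per-allele log-additive logistic model whose intercept is calibrated to a chosen population risk, and it scores individuals with unweighted and weighted risk scores rather than fitting a logistic regression. The function names (`simulate_population`, `auc`) and all input values are hypothetical.

```python
import numpy as np

def simulate_population(freqs, odds_ratios, pop_risk, n=100_000, seed=0):
    """Simulate genotypes (0/1/2 risk alleles) and disease status.

    Assumptions (not taken from the paper): independent SNPs in
    Hardy-Weinberg equilibrium and a log-additive logistic risk model.
    """
    rng = np.random.default_rng(seed)
    freqs = np.asarray(freqs, dtype=float)
    betas = np.log(np.asarray(odds_ratios, dtype=float))
    # One column per SNP; each entry counts risk alleles.
    geno = rng.binomial(2, freqs, size=(n, freqs.size))
    lin = geno @ betas
    # Bisection on the intercept so that the mean risk matches pop_risk.
    lo, hi = -30.0, 30.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if np.mean(1 / (1 + np.exp(-(mid + lin)))) > pop_risk:
            hi = mid
        else:
            lo = mid
    risk = 1 / (1 + np.exp(-((lo + hi) / 2 + lin)))
    status = rng.random(n) < risk
    return geno, status, betas

def auc(score, status):
    """Rank-based (Mann-Whitney) AUC; tied scores get average ranks."""
    score = np.asarray(score, dtype=float)
    status = np.asarray(status, dtype=bool)
    _, inverse, counts = np.unique(score, return_inverse=True, return_counts=True)
    cum = np.cumsum(counts)
    ranks = (cum - (counts - 1) / 2.0)[inverse]
    n1 = status.sum()
    n0 = status.size - n1
    return (ranks[status].sum() - n1 * (n1 + 1) / 2) / (n1 * n0)

# Hypothetical inputs, for illustration only.
freqs = [0.2, 0.35, 0.5]
odds_ratios = [1.1, 1.2, 1.5]
geno, status, betas = simulate_population(freqs, odds_ratios, pop_risk=0.1)

auc_unweighted = auc(geno.sum(axis=1), status)  # risk allele count
auc_weighted = auc(geno @ betas, status)        # alleles weighted by log(OR)
print(round(auc_unweighted, 3), round(auc_weighted, 3))
```

In a replication setting, the frequencies and ORs would come from the GWAS articles and the resulting AUC would be compared with the value reported in the prediction study.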

Highlights

Empirical studies on genetic risk models for multifactorial diseases so far show that the predictive ability is moderate at best (Willems et al, 2011; Husing et al, 2012), with a few promising exceptions (Maller et al, 2006; Romanos et al, 2009).
When logistic regression analysis was used to construct the genetic risk model, area under the receiver operating characteristic curve (AUC) values estimated by the simulation method were similar to the published values, with a median absolute difference of 0.02 [range: 0.00, 0.04].
The predictive ability of genetic risk models can be estimated using simulated data based on results from genome-wide association studies (GWASs).

Summary

Introduction

Empirical studies on genetic risk models for multifactorial diseases have so far shown that predictive ability is moderate at best (Willems et al, 2011; Husing et al, 2012), with a few promising exceptions (Maller et al, 2006; Romanos et al, 2009). Methods that estimate predictive ability without empirical data all assess it as the degree to which the risk model discriminates between patients and nonpatients, quantified as the area under the receiver operating characteristic (ROC) curve (AUC). Using epidemiological parameters such as the population-average risk of disease and the odds ratios (ORs) and frequencies of the genetic variants in the model, these methods obtain the AUC by simulating a dataset for a hypothetical population (Janssens et al, 2006; Pepe et al, 2010) or by using analytical formulas (Gail, 2008; Lu and Elston, 2008; Moonesinghe et al, 2010). We assessed how accurately the predictive ability of genetic risk models can be estimated in simulated data that are created based on the ORs and frequencies of single-nucleotide polymorphisms (SNPs) obtained from genome-wide association studies (GWASs).
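As an illustration of the analytical route, the following Python sketch computes an AUC in closed form under a binormal assumption. It is not necessarily the formula used in any of the cited papers. The second function adds further assumptions: a rare disease, independent SNPs in Hardy-Weinberg equilibrium, and a log-additive risk score that is approximately normal, in which case the score has the same variance in cases and controls and the case mean is shifted by that variance. All function names are hypothetical.

```python
import math

def normal_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def binormal_auc(mean_cases, mean_controls, sd_cases, sd_controls):
    """AUC when scores are normally distributed in cases and in controls."""
    delta = mean_cases - mean_controls
    return normal_cdf(delta / math.hypot(sd_cases, sd_controls))

def approx_auc_from_gwas(freqs, odds_ratios):
    """Approximate AUC of a log-OR weighted score (assumptions in the text).

    Score variance under Hardy-Weinberg equilibrium is
    sum(2 * p * (1 - p) * log(OR)^2); with a rare disease the case mean
    is shifted by that variance, giving AUC = Phi(sqrt(variance / 2)).
    """
    variance = sum(2 * p * (1 - p) * math.log(o) ** 2
                   for p, o in zip(freqs, odds_ratios))
    return normal_cdf(math.sqrt(variance / 2.0))

# Hypothetical inputs, for illustration only.
print(approx_auc_from_gwas([0.2, 0.35, 0.5], [1.1, 1.2, 1.5]))
```

Because it depends only on ORs and allele frequencies, an approximation like this can serve as a quick plausibility check on a simulated AUC, but it ignores the population-average risk and becomes less reliable for common diseases or a small number of SNPs.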

Journal: Frontiers in Genetics	Publication Date: Jun 13, 2014
Citations: 12	License type: cc-by
