Abstract

Background: The area under the ROC curve (AUC) of risk models is known to be influenced by differences in case-mix and in the effect sizes of predictors. The impact of heterogeneity in the correlation among predictors has, however, been underinvestigated. We sought to evaluate how correlation among predictors affects the AUC in development and external populations.

Methods: We simulated hypothetical populations using two different methods based on the means, standard deviations, and correlation of two continuous predictors. In the first approach, the distribution and correlation of predictors were assumed for the total population. In the second approach, these parameters were modeled conditional on disease status. In both approaches, multivariable logistic regression models were fitted to predict disease risk in individuals. Each risk model developed in one population was validated in the remaining populations to investigate external validity.

Results: For both approaches, we observed that the magnitude of the AUC in the development and external populations depends on the correlation among predictors. Lower AUCs were estimated in scenarios of both strong positive and strong negative correlation, depending on the direction of predictor effects and the simulation method. However, when adjusted effect sizes of predictors were specified in opposite directions, increasingly negative correlation consistently improved the AUC. AUCs in external validation populations were higher or lower than in the derivation cohort, even in the presence of similar predictor effects.

Conclusions: Discrimination of risk prediction models should be assessed in various external populations with different correlation structures to make better inferences about model generalizability.

Highlights

  • The area under the ROC curve (AUC) of risk models is known to be influenced by differences in case-mix and effect size of predictors

  • Model development, Approach I (a): When the effects of the predictors pointed in the same direction, an increasingly positive correlation coefficient caused the distributions of the predictors among cases and controls to be more separated from each other; the standard deviation (SD) of the linear predictor (LP) increased

  • Only the correlation among predictors was varied in populations A-E (Table 1), and the estimated AUC was lowest (0.64) in population C, which had the lowest correlation (-0.2)
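
The mechanism described in the Approach I highlight can be illustrated with a small simulation. The sketch below is not the authors' code: the coefficients (b0, b1, b2), the standardized predictors, the sample size, and the seed are all assumed values chosen for illustration. It relies on the standard identity Var(LP) = b1²·SD1² + b2²·SD2² + 2·b1·b2·rho·SD1·SD2, draws two correlated normal predictors for a total population, generates disease status from a logistic model, and computes a rank-based AUC of the true linear predictor instead of fitting a regression model.

```python
import math
import random

def lp_sd(b1, b2, sd1, sd2, rho):
    """SD of the linear predictor b1*x1 + b2*x2 for predictors with correlation rho."""
    var = (b1 * sd1) ** 2 + (b2 * sd2) ** 2 + 2 * b1 * b2 * rho * sd1 * sd2
    return math.sqrt(var)

def auc(scores, labels):
    """Rank-based (Mann-Whitney) AUC; assumes continuous scores, so ties are ignored."""
    order = sorted(range(len(scores)), key=scores.__getitem__)
    rank_sum = sum(rank for rank, i in enumerate(order, start=1) if labels[i])
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    return (rank_sum - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)

def simulate_auc(rho, b0=-2.0, b1=0.5, b2=0.5, n=20000, seed=1):
    """Simulate one population (all parameter values are illustrative assumptions)."""
    rng = random.Random(seed)
    scores, labels = [], []
    for _ in range(n):
        # Two standard-normal predictors with correlation rho.
        z1, z2 = rng.gauss(0, 1), rng.gauss(0, 1)
        x1 = z1
        x2 = rho * z1 + math.sqrt(1 - rho ** 2) * z2
        # Disease status drawn from a logistic model on the linear predictor.
        lp = b0 + b1 * x1 + b2 * x2
        p = 1 / (1 + math.exp(-lp))
        scores.append(lp)
        labels.append(1 if rng.random() < p else 0)
    return auc(scores, labels)

if __name__ == "__main__":
    for rho in (-0.8, 0.0, 0.8):
        print(rho, round(lp_sd(0.5, 0.5, 1, 1, rho), 3), round(simulate_auc(rho), 3))
```

With same-direction effects, both the SD of the linear predictor and the simulated AUC should grow as rho moves from negative to positive. Calling lp_sd with coefficients of opposite sign reverses the ordering, in line with the abstract's statement that negative correlation improves the AUC when effects point in opposite directions.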


Summary

Introduction

The area under the ROC curve (AUC) of risk models is known to be influenced by differences in case-mix and effect size of predictors. Previous simulation studies have shown how the AUC is affected by differences in the distribution of subject characteristics, including disease severity or occurrence (i.e., differences in “case-mix”), and by heterogeneity in the effect sizes of risk factors between development and validation samples [15, 16]. These studies concluded that differences in both case-mix and predictor effects between derivation and validation populations must be assessed to fully appreciate external validation results.

