Abstract

Background: It is often unclear which approach to fit, assess and adjust a model will yield the most accurate prediction model. We present an extension of an approach for comparing modelling strategies in linear regression to the setting of logistic regression and demonstrate its application in clinical prediction research.

Methods: A framework for comparing logistic regression modelling strategies by their likelihoods was formulated using a wrapper approach. Five different strategies for modelling, including simple shrinkage methods, were compared in four empirical data sets to illustrate the concept of a priori strategy comparison. Simulations were performed in both randomly generated data and empirical data to investigate the influence of data characteristics on strategy performance. We applied the comparison framework in a case study setting. Optimal strategies were selected based on the results of a priori comparisons in a clinical data set and the performance of models built according to each strategy was assessed using the Brier score and calibration plots.

Results: The performance of modelling strategies was highly dependent on the characteristics of the development data in both linear and logistic regression settings. A priori comparisons in four empirical data sets found that no strategy consistently outperformed the others. The percentage of times that a model adjustment strategy outperformed a logistic model ranged from 3.9 to 94.9 %, depending on the strategy and data set. However, in our case study setting the a priori selection of optimal methods did not result in detectable improvement in model performance when assessed in an external data set.

Conclusion: The performance of prediction modelling strategies is a data-dependent process and can be highly variable between data sets within the same clinical domain. A priori strategy comparison can be used to determine an optimal logistic regression modelling strategy for a given data set before selecting a final modelling approach.

Electronic supplementary material: The online version of this article (doi:10.1186/s12874-016-0209-0) contains supplementary material, which is available to authorized users.

Highlights

  • It is often unclear which approach to fit, assess and adjust a model will yield the most accurate prediction model

  • It has been shown that for linear regression the success of a strategy is heavily influenced by a few key data characteristics, and in order to address this a framework was proposed for the a priori comparison of different model building strategies in a given data set [17]

  • We present an extended framework for comparing strategies in linear and logistic regression model building


Summary

Introduction

It is often unclear which approach to fit, assess and adjust a model will yield the most accurate prediction model. We present an extension of an approach for comparing modelling strategies in linear regression to the setting of logistic regression and demonstrate its application in clinical prediction research. Logistic regression models are frequently utilized in clinical prediction research and have a range of applications [1,2,3,4]. Despite great efforts to present clear guidelines for the prediction model building process [14,15,16], it may still be unclear to researchers which modelling approach is most likely to yield a model with optimal external performance. It has been shown that for linear regression the success of a strategy is heavily influenced by a few key data characteristics. To address this, a framework was proposed for the a priori comparison of different model building strategies in a given data set [17].
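To make the idea of an a priori strategy comparison concrete, the following is a minimal sketch and not the authors' framework. It assumes repeated random train/test splits as a stand-in for the wrapper approach, and a ridge-penalised fit as a stand-in for a shrinkage strategy. The function names (`fit_logistic`, `log_likelihood`, `compare_strategies`), the penalty value, and the simulated data are all hypothetical. It reports how often the adjusted strategy achieves a higher held-out log-likelihood than an ordinary logistic model.

```python
import numpy as np

def fit_logistic(X, y, ridge=0.0, n_iter=50):
    """Fit logistic regression by Newton-Raphson; `ridge` penalises slopes only."""
    Xd = np.column_stack([np.ones(len(X)), X])      # add intercept column
    beta = np.zeros(Xd.shape[1])
    penalty = ridge * np.eye(Xd.shape[1])
    penalty[0, 0] = 0.0                             # intercept is not penalised
    jitter = 1e-8 * np.eye(Xd.shape[1])             # guards against a singular Hessian
    for _ in range(n_iter):
        eta = np.clip(Xd @ beta, -30, 30)           # avoid overflow in exp
        p = 1.0 / (1.0 + np.exp(-eta))
        grad = Xd.T @ (y - p) - penalty @ beta
        hess = Xd.T @ (Xd * (p * (1 - p))[:, None]) + penalty
        beta = beta + np.linalg.solve(hess + jitter, grad)
    return beta

def log_likelihood(beta, X, y):
    """Bernoulli log-likelihood of outcomes y under coefficients beta."""
    Xd = np.column_stack([np.ones(len(X)), X])
    eta = np.clip(Xd @ beta, -30, 30)
    p = np.clip(1.0 / (1.0 + np.exp(-eta)), 1e-12, 1 - 1e-12)
    return float(np.sum(y * np.log(p) + (1 - y) * np.log(1 - p)))

def compare_strategies(X, y, ridge=1.0, n_repeats=200, test_frac=0.3, seed=0):
    """Share of random splits in which the penalised fit beats the plain fit."""
    rng = np.random.default_rng(seed)
    n = len(y)
    n_test = int(round(test_frac * n))
    wins = 0
    for _ in range(n_repeats):
        idx = rng.permutation(n)
        test, train = idx[:n_test], idx[n_test:]
        b_ref = fit_logistic(X[train], y[train])               # reference strategy
        b_alt = fit_logistic(X[train], y[train], ridge=ridge)  # adjusted strategy
        if log_likelihood(b_alt, X[test], y[test]) > log_likelihood(b_ref, X[test], y[test]):
            wins += 1
    return wins / n_repeats

# Simulated data (assumption: 200 subjects, 5 predictors, 2 of them pure noise).
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 5))
true_beta = np.array([0.8, -0.5, 0.3, 0.0, 0.0])
y = (rng.random(200) < 1 / (1 + np.exp(-(X @ true_beta - 0.2)))).astype(float)

share = compare_strategies(X, y)
print(f"penalised strategy beat the reference in {100 * share:.1f}% of splits")
```

The resulting share is specific to the data set it is computed on, which is the point of running the comparison before committing to a final modelling approach.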

