Abstract
Clinical prediction models (CPMs) can predict clinically relevant outcomes or events. Typically, prognostic CPMs are derived to predict the risk of a single future outcome. However, there are many medical applications where two or more outcomes are of interest, and CPMs should reflect this more widely so that they can accurately estimate the joint risk of multiple outcomes simultaneously. A potentially naïve approach to multi-outcome risk prediction is to derive a CPM for each outcome separately, then multiply the predicted risks. This approach is only valid if the outcomes are conditionally independent given the covariates, and it fails to exploit the potential relationships between the outcomes. This paper outlines several approaches that could be used to develop CPMs for multiple binary outcomes. We consider four methods, ranging in complexity and conditional independence assumptions: namely, probabilistic classifier chains, multinomial logistic regression, multivariate logistic regression, and a Bayesian probit model. These are compared with methods that rely on conditional independence: separate univariate CPMs and stacked regression. Using a simulation study and a real-world example, we illustrate that CPMs for joint risk prediction of multiple outcomes should only be derived using methods that model the residual correlation between outcomes. In such a situation, our results suggest that probabilistic classifier chains, multinomial logistic regression, or the Bayesian probit model are all appropriate choices. We call into question the development of CPMs for each outcome in isolation when multiple correlated or structurally related outcomes are of interest and recommend more multivariate approaches to risk prediction.
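To make the conditional independence point concrete, the following is a minimal sketch in Python. It is not the paper's method or data: the joint distribution and the helper names (`marginal`, `naive_product`, `chain_rule`) are hypothetical and chosen only for illustration. For a single fixed covariate pattern, it compares the product of the two marginal risks with the chain-rule factorisation P(Y1 = 1) × P(Y2 = 1 | Y1 = 1), which is the decomposition a classifier chain would target.

```python
# Hypothetical joint distribution of two binary outcomes (Y1, Y2) for one
# fixed covariate pattern. The numbers are illustrative assumptions only.
joint = {(1, 1): 0.20, (1, 0): 0.10, (0, 1): 0.10, (0, 0): 0.60}


def marginal(joint, index, value=1):
    """Marginal probability that outcome `index` equals `value`."""
    return sum(p for y, p in joint.items() if y[index] == value)


def naive_product(joint):
    """Joint risk of both outcomes assuming conditional independence."""
    return marginal(joint, 0) * marginal(joint, 1)


def chain_rule(joint):
    """Joint risk of both outcomes via P(Y1 = 1) * P(Y2 = 1 | Y1 = 1)."""
    p_y1 = marginal(joint, 0)
    p_y2_given_y1 = joint[(1, 1)] / p_y1
    return p_y1 * p_y2_given_y1


naive = naive_product(joint)
chained = chain_rule(joint)

print(f"True joint risk P(Y1=1, Y2=1): {joint[(1, 1)]:.2f}")
print(f"Product of marginal risks:     {naive:.2f}")
print(f"Chain-rule factorisation:      {chained:.2f}")
```

In this made-up example the outcomes are positively associated, so the product of the marginal risks understates the joint risk, whereas the chain-rule factorisation recovers it exactly. In practice both factors would be estimated from data, so the recovery would only be approximate.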
Highlights
Clinical prediction models (CPMs) aim to predict the probability that clinically relevant outcomes are present or will occur in the future for an individual, given information known about them at the time of prediction.[1,2,3] CPMs are predominantly derived in a multivariable regression framework, which combines estimated associations between multiple predictors and an outcome of interest. Generally, different CPMs are developed in isolation, where each model considers only a single outcome.
The remainder of the paper is structured as follows: in Section 2 we outline notation and present current univariate approaches to developing CPMs; we provide an overview of several methods to develop prognostic CPMs for multiple binary outcomes in Section 3; in Section 4 we describe the design and results of a simulation study comparing the methods, while in Section 5 we apply the methods to a real-world critical care example; in Section 6 we discuss our findings and present directions for future work.
Our results suggest that probabilistic classifier chains, multinomial logistic regression, or the multivariate probit model might be the most appropriate choice for developing multivariate CPMs for multiple binary outcomes.
Summary
Clinical prediction models (CPMs) aim to predict the probability that clinically relevant outcomes are present (diagnostic prediction) or will occur in the future (prognostic prediction) for an individual, given information known about them at the time of prediction.[1,2,3] CPMs are predominantly derived in a multivariable regression framework (eg, logistic regression for binary outcomes), which combines estimated associations between multiple predictors (risk or prognostic factors) and an outcome of interest. Generally, different CPMs are developed in isolation, where each model considers only a single outcome. There are many medical applications where two or more outcomes are of interest. As such, this should be more widely reflected in CPMs so they can accurately estimate the joint risk of multiple outcomes simultaneously.[4,5] For example, clinical teams consider mortality, morbidity, and quality of life in their decision-making for performing cardiovascular surgery, but surgical risk models (which are widely considered integral to surgical practice) are usually developed to predict single outcomes.[6,7] Another motivating example is in predicting likely outcomes during and after pregnancy, which often requires a multivariate perspective.[8] As a final motivating example, individuals are increasingly developing multiple diseases over their lifetime (ie, multimorbidity), but the many CPMs that predict risks of common noncommunicable diseases such as cardiovascular disease,[9] types of cancer,[10] and chronic kidney disease[11] are usually developed in isolation. For CPMs to assist in multimorbidity resource planning and management, one needs to be able to estimate the (joint) risk of different combinations of conditions co-occurring,[12,13] which is only possible by taking a multivariate approach to prediction.