<h3>Purpose/Objective(s)</h3>
State-of-the-art prediction modeling relies on machine learning (ML). To provide accurate out-of-sample predictions, particularly when the number of available covariates is large, these models reduce variance by adding bias. However, these artificial biases frequently create anomalies that are clinically inconsistent, hindering their use in clinical practice. We illustrate such biases in outcome prediction ML models using the National Cancer Data Base (NCDB) and propose a solution that leverages clinical expertise. We hypothesize that integrating clinical expertise into outcome prediction ML models will correct significant problems in ML modeling without compromising model accuracy, making the models more clinically intuitive.
<h3>Materials/Methods</h3>
We use data from 92,990 prostate cancer patients in the NCDB (death rate: 28%) to build a gradient boosted machine (GBM) model that predicts 10-year survival after treatment. We then investigate how the model output depends on each covariate and identify inconsistencies with clinical knowledge. To correct these biases, we collect clinical intuition from five oncologists through a survey asking how increasing each covariate affects a patient's survival (increase/decrease/variable effect). We then encode the survey results as clinically informed monotonic constraints in a LASSO model and compare the AUC with and without the constraints.
<h3>Results</h3>
We find that while the GBM model has a relatively high out-of-sample AUC of 0.79 (compared with similar studies reporting AUCs between 0.65 and 0.83, although patient cohorts, methodologies, and outcomes may differ), the way some covariates are used is not clinically consistent. For example, holding other covariates equal, the probability of survival should be consistently higher for patients with lower prostate specific antigen (PSA) levels than for patients with higher levels, yet the model predicts (with high confidence) an erratic, non-monotonic pattern. This occurs because the GBM model biases its predictions for patients who comprise a small fraction of the cohort (i.e., patients with very low PSA values) to avoid high variance, but the artificially injected bias is toward lower survival, which is not clinically consistent. Covariates that the oncologists identified as having a monotonic effect on the probability of survival include PSA and age. We find that adding clinically informed constraints to a LASSO model resolves the clinical inconsistency without compromising model performance (AUC is 0.77 both with and without constraints).
<h3>Conclusion</h3>
We illustrate that even in a large-data regime, state-of-the-art clinical risk prediction models can suffer from such biases. Our work shows that integrating clinical intuition can make models clinically consistent and indicates that this approach is a promising direction for resolving critical challenges in ML-based decision support.
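The covariate-dependence check described in the Materials/Methods can be illustrated with a minimal sketch using scikit-learn's GradientBoostingClassifier and partial dependence. The synthetic data, feature meanings, and hyperparameters below are illustrative assumptions, not the NCDB cohort or the authors' exact configuration; a non-monotonic partial dependence curve for a covariate such as PSA is the kind of clinically inconsistent pattern the abstract describes.

<pre><code class="language-python">
# Sketch: fit a GBM and inspect how predictions depend on one covariate.
# Data and settings are assumptions for illustration only.
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.inspection import partial_dependence

rng = np.random.default_rng(0)
n = 5000
X = np.column_stack([
    rng.lognormal(mean=2.0, sigma=0.8, size=n),  # column 0: "PSA"-like covariate
    rng.normal(loc=65, scale=8, size=n),         # column 1: "age"-like covariate
    rng.normal(size=n),                          # column 2: other covariate
])
# Toy outcome: survival probability decreases with the PSA-like covariate.
p_survive = 1.0 / (1.0 + np.exp(0.15 * (X[:, 0] - 8.0)))
y = (rng.random(n) &lt; p_survive).astype(int)

gbm = GradientBoostingClassifier(n_estimators=200, max_depth=3, random_state=0)
gbm.fit(X, y)

# Partial dependence of the prediction on the PSA-like covariate;
# an erratic, non-monotonic curve here flags a clinically inconsistent effect.
pd_result = partial_dependence(gbm, X, features=[0], kind="average")
print(pd_result["average"][0])
</code></pre>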
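One way to encode the clinically informed monotonic constraints in an L1-penalized (LASSO-type) linear model is to bound the sign of the corresponding coefficients. The sketch below uses scipy.optimize with assumed toy data and penalty strength; it illustrates the general technique, not the authors' implementation.

<pre><code class="language-python">
# Sketch: L1-penalized logistic regression with a sign constraint on one
# coefficient, standing in for a clinically informed monotonic constraint.
# Data, penalty strength, and variable roles are assumptions for illustration.
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)
n, p = 500, 6
X = rng.normal(size=(n, p))
# Toy outcome: a higher value of covariate 0 (a "PSA"-like variable)
# lowers the odds of survival.
p_survive = 1.0 / (1.0 + np.exp(0.8 * X[:, 0]))
y = (rng.random(n) &lt; p_survive).astype(float)

lam = 0.01  # L1 penalty strength (assumed)

def neg_penalized_loglik(beta):
    logits = X @ beta
    loglik = np.sum(y * logits - np.logaddexp(0.0, logits))
    return -loglik + lam * np.sum(np.abs(beta))

# Monotonic constraint encoded as a sign bound: the coefficient of covariate 0
# must be non-positive, so predicted survival is non-increasing in that
# covariate; all other coefficients are unconstrained.
bounds = [(None, 0.0)] + [(None, None)] * (p - 1)

res = minimize(neg_penalized_loglik, x0=np.zeros(p),
               method="L-BFGS-B", bounds=bounds)
print("Constrained coefficients:", np.round(res.x, 3))
</code></pre>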