Background: Hospital length of stay (LoS) varies widely across hip arthroplasty (HA) and knee arthroplasty (KA) patients and depends on multiple factors. Prediction methods are needed to improve hospital capacity planning and to identify patients at risk of a long LoS. This study aims (1) to compare the performance of previously applied machine learning (ML) and regression methods for LoS classification and LoS regression in a multi-hospital setting for primary HA and KA patients. In addition, the study aims (2a) to assess which variables are the most important predictors of LoS and, specifically, (2b) whether patient-reported outcome measures (PROMs) collected before surgery act as important predictors.

Methods: In total, 2611 primary HA and 2077 primary KA patients from eight German hospitals were included to train and test extreme gradient boosting (XGBoost), naïve Bayes (NB) and logistic regression (LogReg) for classification, and XGBoost and linear regression (LinReg) for regression. Area under the receiver operating characteristic curve (AUC) and mean absolute error (MAE) were used as the primary performance indicators for classification and regression, respectively.

Results: For classification, the highest AUC in the HA sample was reached by XGBoost and LogReg (AUC = 0.81), both of which statistically significantly outperformed NB. In the KA sample, no statistically significant difference was found between any of the methods, and AUC was lower for all models than in the HA sample. For regression, MAE was lowest for XGBoost (1.43 days for HA and 1.21 days for KA). PROMs and hospital indicators were among the most relevant predictors in all cases.

Conclusion: The study demonstrated robust performance of ML in predicting LoS. PROMs reflect relevant features for prediction; they should be routinely collected and used in practical applications. XGBoost may be a superior prediction tool compared with regression or other ML models in certain circumstances.
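
As a reading aid for the two performance indicators named in the Methods, the following minimal sketch shows how AUC and MAE can be computed from predictions. It is not the study's code: the function names `auc` and `mae` and all numeric values are hypothetical and chosen purely for illustration. AUC is computed here through its rank interpretation, the probability that a randomly chosen long-stay patient receives a higher predicted risk than a randomly chosen short-stay patient.

```python
def auc(labels, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def mae(y_true, y_pred):
    """Mean absolute error between observed and predicted values."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical values for illustration only, not data from the study.
long_stay = [0, 0, 1, 1, 0, 1]              # 1 = long LoS (classification target)
risk = [0.1, 0.4, 0.35, 0.8, 0.2, 0.9]      # predicted probability of long LoS
days_true = [5, 7, 6, 9]                    # observed LoS in days (regression target)
days_pred = [6, 6, 6, 7]                    # predicted LoS in days

print("AUC:", auc(long_stay, risk))
print("MAE (days):", mae(days_true, days_pred))
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect ranking, while MAE is expressed in the unit of the target, here days of hospital stay.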