Periprosthetic joint infection (PJI) is an uncommon but serious complication of total joint arthroplasty. Personalized risk prediction and risk factor management may allow better preoperative assessment and improved outcomes. We evaluated different data-driven approaches to developing surgery-specific PJI prediction models using large-scale data from electronic health records. A large institutional arthroplasty registry was leveraged to collect data on 58,574 procedures in 41,844 patients who underwent at least one primary and/or revision hip and/or knee arthroplasty between 2000 and 2019. The registry dataset was augmented with additional clinical, procedural, and laboratory data from the electronic health records, yielding more than 100 potential predictor variables. The main outcome was PJI within the first year after surgery. We implemented both traditional and machine learning methods for model development (lasso regression, relaxed lasso regression, ridge regression, random forest, stepwise regression, extreme gradient boosting, and neural network) and used 10-fold cross-validation to calculate measures of model performance in terms of discrimination (c-statistic), cross-entropy loss, and calibration. All models performed similarly in predicting PJI risk, with negligible differences of less than 0.08 between the best- and worst-performing models. The relaxed and fully relaxed lasso models using the Cox model structure outperformed the other models, with concordances of 0.787 in primary hip arthroplasty and 0.722 in revision hip arthroplasty and with the number of predictors ranging from nine to 41. The concordances with the relaxed lasso models were 0.681 in primary and 0.699 in revision knee arthroplasty, with a higher number of predictors in the models. Predictors included in the models varied substantially across the four surgical groups. The incorporation of additional data from the electronic health records offers limited improvement in PJI risk stratification. Furthermore, the improvement in PJI risk prediction with the machine learning approaches was modest and may not justify the added complexity.
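
For readers unfamiliar with the discrimination measure reported above, the following is a minimal sketch of how a c-statistic (Harrell's concordance) can be computed for right-censored data. It is an illustration only, not the authors' implementation: the function name `concordance_index`, its arguments, and the toy values are hypothetical, and pairs with tied follow-up times are simply skipped for brevity.

```python
from itertools import combinations


def concordance_index(times, events, risks):
    """Harrell's c-statistic for right-censored data (illustrative sketch).

    times  : follow-up time for each procedure
    events : 1 if the outcome (e.g. PJI) was observed, 0 if censored
    risks  : predicted risk score (higher = outcome expected sooner)
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Order the pair so that `a` has the shorter follow-up time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        # Simplifying assumption: a pair is comparable only if the times
        # differ and the shorter follow-up ended in an observed event.
        if times[a] == times[b] or not events[a]:
            continue
        comparable += 1
        if risks[a] > risks[b]:
            concordant += 1.0   # higher risk had the earlier event
        elif risks[a] == risks[b]:
            concordant += 0.5   # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data (hypothetical values, not taken from the study).
times = [1, 2, 3, 4]
events = [1, 1, 0, 1]
risks = [0.9, 0.8, 0.1, 0.2]
print(concordance_index(times, events, risks))
```

A value of 0.5 corresponds to predictions no better than chance and 1.0 to perfect ranking, which is the scale on which the concordances of 0.681 to 0.787 above are read.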