Abstract

BACKGROUND CONTEXT: With an increasing number of web-based calculators designed to provide the probability of an individual achieving improvement after lumbar spine surgery, there is a need to determine the accuracy of these models.

PURPOSE: To perform an internal and external validation study of the reduced Quality Outcomes Database web-based Calculator (QOD-Calc).

STUDY DESIGN: Observational longitudinal cohort.

PATIENT SAMPLE: Patients enrolled study-wide in the Quality Outcomes Database (QOD) and patients enrolled in DaneSpine at a single institution who had elective lumbar spine surgery, with the baseline data needed to complete QOD-Calc and 12-month postoperative data.

OUTCOME MEASURES: Oswestry Disability Index (ODI), Numeric Rating Scales (NRS) for back and leg pain, EuroQOL-5D (EQ-5D).

METHODS: Baseline data elements were entered into QOD-Calc to determine the probability of each patient having Any Improvement and 30% Improvement in NRS leg pain, NRS back pain, EQ-5D and ODI. These probabilities were compared with the actual 12-month postoperative data for each of the QOD and DaneSpine cases. Receiver operating characteristic analyses were performed and calibration plots were created to assess model performance.

RESULTS: A total of 24,755 QOD cases and 8,105 DaneSpine lumbar cases were included in the analysis. QOD-Calc had acceptable to outstanding ability (AUC: 0.694–0.874) to predict Any Improvement in the QOD cohort and moderate to acceptable ability (AUC: 0.658–0.747) to predict 30% Improvement. QOD-Calc had acceptable to exceptional ability (AUC: 0.669–0.734) to predict Any Improvement and moderate to exceptional ability (AUC: 0.619–0.862) to predict 30% Improvement in the DaneSpine cohort. AUCs for the DaneSpine cohort were consistently lower than the AUCs for the QOD validation cohort.

CONCLUSION: QOD-Calc performs well in predicting outcomes in a patient population similar to the one used to develop it. Although still acceptable, model performance was slightly worse in a distinct population, even though that sample was more homogeneous. Model performance may also be attributable to the low discrimination threshold, with close to 90% of cases reporting Any Improvement in outcome. Prediction models that are highly specific to the characteristics of the population may need to be developed.
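
The abstract reports discrimination (AUC from receiver operating characteristic analysis) and calibration. As an illustration only, the sketch below shows how these two quantities can be computed from predicted probabilities and observed binary outcomes. It is not the study's analysis code: the function names (`auc`, `calibration_bins`), the equal-width binning, and the eight data points are all hypothetical and invented for the example.

```python
from typing import List, Sequence, Tuple


def auc(probs: Sequence[float], outcomes: Sequence[int]) -> float:
    """Area under the ROC curve, computed as the probability that a randomly
    chosen improved case gets a higher predicted probability than a randomly
    chosen non-improved case (ties count as one half).

    Pairwise comparison is O(n^2); fine for a sketch, slow for large cohorts.
    """
    pos = [p for p, y in zip(probs, outcomes) if y == 1]
    neg = [p for p, y in zip(probs, outcomes) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one case in each outcome class")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def calibration_bins(
    probs: Sequence[float], outcomes: Sequence[int], n_bins: int = 5
) -> List[Tuple[float, float, int]]:
    """Group cases into equal-width probability bins and return, per non-empty
    bin, (mean predicted probability, observed improvement rate, case count).
    Plotting observed against predicted gives a calibration plot."""
    bins: List[List[Tuple[float, int]]] = [[] for _ in range(n_bins)]
    for p, y in zip(probs, outcomes):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 goes in the last bin
        bins[idx].append((p, y))
    rows = []
    for b in bins:
        if b:
            mean_pred = sum(p for p, _ in b) / len(b)
            observed = sum(y for _, y in b) / len(b)
            rows.append((mean_pred, observed, len(b)))
    return rows


# Hypothetical data: predicted probability of improvement and the
# observed 12-month outcome (1 = improved, 0 = not improved).
predicted = [0.95, 0.90, 0.85, 0.80, 0.70, 0.60, 0.40, 0.30]
improved = [1, 1, 1, 0, 1, 1, 0, 0]

print(f"AUC = {auc(predicted, improved):.3f}")
for mean_pred, observed, count in calibration_bins(predicted, improved):
    print(f"predicted {mean_pred:.2f}  observed {observed:.2f}  n={count}")
```

A well-calibrated model yields bins whose observed rate is close to the mean predicted probability. AUC measures only how well cases are ranked, so a model can discriminate well and still be poorly calibrated, which is why the two are assessed separately.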
