Background: Repair of a rotator cuff tear is not always feasible, depending on the severity of the tear. Although several studies have investigated factors related to reparability and various methods to predict it, inconsistent scoring methods and a lack of validation have hindered the utility of these methods.

Purpose: To develop machine learning models to predict the reparability of rotator cuff tears, compare them with previous scoring systems, and provide an accessible online model.

Study Design: Cohort study; Level of evidence, 3.

Methods: Arthroscopic rotator cuff repairs for tears with both anteroposterior and mediolateral diameters >1 cm on preoperative magnetic resonance imaging were included and divided into a training set (70%) and an internal validation set (30%). For external validation, rotator cuff repairs performed by 2 different surgeons were included in a test set. Machine learning models and a newly adjusted scoring system were developed using the training set. The performance of the models, including the adjusted scoring system and 2 previous scoring systems, was compared using the test set. Performance was assessed using metrics such as the area under the receiver operating characteristic curve (AUROC) and compared using the net reclassification improvement, with the adjusted scoring system as the reference.

Results: A total of 429 patients were included in the training and internal validation sets, and 112 patients were included in the test set. In the test set, an elastic-net logistic regression demonstrated the best performance, with an AUROC of 0.847 and a net reclassification improvement of 0.071 compared with the adjusted scoring system. The AUROC of the adjusted scoring system was 0.786, and the AUROCs of the previous scoring systems were 0.757 and 0.687. The elastic-net logistic regression was transformed into an accessible online model.

Conclusion: The performance of the machine learning model, which provides a probability estimate for rotator cuff reparability, is comparable with that of the adjusted scoring system.
Nevertheless, when deploying prediction models beyond the original cohort, regardless of whether they rely on machine learning or scoring systems, clinicians should exercise caution and not rely solely on the output of the model.
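
The two comparison metrics named above, AUROC and net reclassification improvement (NRI), can be illustrated with a minimal sketch. It is not the study's implementation: the abstract does not say which NRI variant was used, so the single-threshold categorical form below, the 0.5 cutoff, the function names, and the toy data are all assumptions made for illustration. Label 1 stands for the outcome of interest.

```python
from typing import Sequence


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUROC as the probability that a randomly chosen positive case
    scores higher than a randomly chosen negative case (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def categorical_nri(
    labels: Sequence[int],
    reference_probs: Sequence[float],
    new_probs: Sequence[float],
    threshold: float = 0.5,  # assumed cutoff, not taken from the study
) -> float:
    """Two-category NRI of a new model relative to a reference model.

    NRI = [P(up | event) - P(down | event)]
        + [P(down | non-event) - P(up | non-event)]
    where "up" means crossing the threshold from low to high risk.
    """
    up_event = down_event = up_non = down_non = 0
    n_event = n_non = 0
    for y, ref, new in zip(labels, reference_probs, new_probs):
        moved_up = new >= threshold and ref < threshold
        moved_down = new < threshold and ref >= threshold
        if y == 1:
            n_event += 1
            up_event += moved_up
            down_event += moved_down
        else:
            n_non += 1
            up_non += moved_up
            down_non += moved_down
    if n_event == 0 or n_non == 0:
        raise ValueError("need at least one event and one non-event")
    return (up_event - down_event) / n_event + (down_non - up_non) / n_non


# Toy data (hypothetical, not from the study).
labels = [1, 1, 1, 0, 0, 0]
reference = [0.6, 0.4, 0.7, 0.6, 0.3, 0.2]  # e.g. a rescaled scoring system
candidate = [0.8, 0.6, 0.7, 0.4, 0.3, 0.2]  # e.g. a model's probabilities

print("AUROC reference:", auroc(labels, reference))
print("AUROC candidate:", auroc(labels, candidate))
print("NRI vs reference:", categorical_nri(labels, reference, candidate))
```

A positive NRI means the candidate moves more cases in the correct direction across the threshold than the reference does. The value depends on the chosen cutoff, which is one reason to read it alongside the AUROC rather than in isolation.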