Benign and malignant myxoid soft tissue tumors share clinical, imaging, and histologic features that can make diagnosis challenging. This study compares the diagnostic performance of a radiomics-based machine learning (ML) model with that of musculoskeletal radiologists. Manual segmentation of 90 myxoid soft tissue tumors (45 myxomas and 45 myxofibrosarcomas) was performed on axial T1 and either T2FS or STIR magnetic resonance imaging sequences. Eighty-seven radiomic features were extracted from each modality. Five ML models were trained on 40 tumors to classify tumors as benign or malignant and then tested on an additional 50 tumors using cross-validation. The accuracy of the best ML model, selected by area under the receiver operating characteristic curve (AUC), was compared with the consensus diagnosis of three musculoskeletal radiologists. Correlation between radiologist confidence (equivocal, probably, consistent with) and accuracy was tested. The best ML classifier was a logistic regression model (AUC 0.792). Using T1 + T2/STIR images, the ML model correctly classified 78% (39/50) of tumors, similar to the 74% (37/50) classified correctly by radiologists. When radiologists disagreed, the consensus diagnosis classified 50% (7/14) of tumors correctly compared with 86% (12/14) for the ML model, though this difference did not reach statistical significance. Radiologists had a cumulative accuracy of 91% (30/33) when they rated their confidence 'consistent with' compared with 61% (31/51) when they rated their confidence 'equivocal/probably' (P = 0.006). In cases where radiologists rated their confidence 'equivocal/probably', the ML model had 76% accuracy (39/51). A radiomics-based ML model predicted benign or malignant diagnosis in myxoid soft tissue tumors similarly to the consensus diagnosis of three musculoskeletal radiologists. Radiologists' confidence in a diagnosis correlated strongly with their diagnostic accuracy.
Though radiomics and radiologists perform similarly overall, radiomics may provide novel diagnostic utility when radiologist confidence is low, or when radiologists disagree.
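The confidence-stratified comparison can be rechecked from the reported counts alone. The sketch below is illustrative only: the abstract does not name the statistical test used, so the chi-square test with Yates' continuity correction is an assumption, and the helper name `yates_chi2_p` is hypothetical. Under that assumption the p-value lands close to the reported P = 0.006.

```python
from math import erfc, sqrt


def yates_chi2_p(a, b, c, d):
    """Two-sided p-value for the 2x2 table [[a, b], [c, d]] using a
    chi-square test with Yates' continuity correction (1 degree of freedom).

    Assumption: the abstract does not state which test was used.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    # Yates' correction shrinks |ad - bc| by n/2 (never below zero).
    corrected = max(abs(a * d - b * c) - n / 2, 0.0)
    chi2 = n * corrected ** 2 / (row1 * row2 * col1 * col2)
    # For 1 degree of freedom, the chi-square survival function is erfc(sqrt(x / 2)).
    return erfc(sqrt(chi2 / 2))


# Counts reported in the abstract, as (correct, total).
high_conf = (30, 33)  # radiologist confidence 'consistent with'
low_conf = (31, 51)   # radiologist confidence 'equivocal/probably'

acc_high = high_conf[0] / high_conf[1]
acc_low = low_conf[0] / low_conf[1]

p_value = yates_chi2_p(
    high_conf[0], high_conf[1] - high_conf[0],
    low_conf[0], low_conf[1] - low_conf[0],
)

print(f"accuracy: {acc_high:.0%} vs {acc_low:.0%}, p = {p_value:.4f}")
```

Only the standard library is used, so the check runs without a statistics package; a different test (for example Fisher's exact test) would give a somewhat different p-value.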