The primary aims were to identify radiomics ultrasound features that can distinguish benign from malignant adnexal masses with solid ultrasound morphology, and primary malignant (including borderline and primary invasive) from metastatic solid ovarian masses, and to develop ultrasound-based machine learning models that include radiomics features to discriminate between benign and malignant solid adnexal masses. The secondary aim was to compare the discriminative performance of our newly developed radiomics models with that of the Assessment of Different NEoplasias in the adneXa (ADNEX) model and that of subjective assessment by an experienced ultrasound examiner.
This was a retrospective, observational, single-center study conducted at Fondazione Policlinico Universitario A. Gemelli IRCCS, in Rome, Italy. Included were patients with a histological diagnosis of an adnexal tumor that had solid morphology, according to International Ovarian Tumor Analysis (IOTA) terminology, at preoperative ultrasound examination performed in 2014-2020, and who were managed with surgery. The patient cohort was split randomly into training and validation sets at a ratio of 70:30, with the same proportion of benign and malignant tumors in the two subsets; malignant tumors included borderline, primary invasive and metastatic tumors. We extracted 68 radiomics features belonging to two families: intensity-based statistical features and textural features. Models to predict malignancy were built using a random forest classifier, tuned using 5-fold cross-validation on the training set, and tested on the held-out validation set. The variables used in model-building were patient age and those radiomics features that differed statistically significantly between benign and malignant adnexal masses and were assessed as not redundant based on the Pearson correlation coefficient.
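The redundancy check based on the Pearson correlation coefficient can be illustrated with a minimal sketch. It is not the authors' implementation: the greedy keep-first rule, the 0.9 threshold, and the feature names and values are all assumptions made for illustration, since the abstract does not state how redundancy was decided.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant feature has no defined correlation
    return cov / sqrt(var_x * var_y)


def drop_redundant(features, threshold=0.9):
    """Greedy filter: keep a feature only if |r| < threshold with every kept one.

    `features` maps a feature name to its values across lesions.
    The threshold of 0.9 is an assumed value, not taken from the study.
    """
    kept = []
    for name, values in features.items():
        if all(abs(pearson(values, features[k])) < threshold for k in kept):
            kept.append(name)
    return kept


# Hypothetical toy data: three features measured on five lesions.
toy = {
    "mean_intensity": [1.0, 2.0, 3.0, 4.0, 5.0],
    "median_intensity": [1.1, 2.1, 2.9, 4.2, 5.0],  # nearly collinear with the mean
    "glcm_contrast": [2.0, 1.0, 4.0, 1.5, 3.0],
}
selected = drop_redundant(toy)
print(selected)
```

With a greedy rule like this, the outcome depends on the order in which features are visited, so a real pipeline would need to fix that order, for example by ranking features on their univariate significance first.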
We evaluated the discriminative ability of the models using the area under the receiver-operating-characteristics curve (AUC) and their classification performance using sensitivity and specificity, and compared these with the corresponding results for the ADNEX model and for subjective assessment by an experienced ultrasound examiner.
In total, 326 patients were included and 775 preoperative ultrasound images were analyzed. Of the 68 radiomics features extracted, 52 differed statistically significantly between benign and malignant tumors in the training set, and 18 uncorrelated features were selected for inclusion in model-building. The same 52 radiomics features differed significantly between benign, primary malignant and metastatic tumors. However, the values of the features overlapped between primary malignant and metastatic tumors and did not differ significantly between them. In the validation set, 25/98 tumors (25.5%) were benign and 73/98 (74.5%) were malignant (6 borderline, 57 primary invasive, 10 metastatic). In the validation set, a model including only radiomics features had an AUC of 0.80, sensitivity of 0.78 and specificity of 0.76 at an optimal cut-off for risk of malignancy of 68%, based on Youden's index. The corresponding results for a model including age and radiomics features were AUC of 0.79, sensitivity of 0.86 and specificity of 0.56 (cut-off 60%, based on Youden's index), while those of the ADNEX model were AUC of 0.88, sensitivity of 0.99 and specificity of 0.64 (at a 20% risk-of-malignancy cut-off). Subjective assessment had a sensitivity of 0.99 and specificity of 0.72.
Our radiomics model had moderate discriminative ability on internal validation, and the addition of age to this model did not improve its performance. Even though our radiomics models had discriminative ability inferior to that of the ADNEX model, our results are sufficiently promising to justify continued development of radiomics analysis of ultrasound images of adnexal masses.
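The cut-offs reported for the radiomics models were chosen using Youden's index, J = sensitivity + specificity - 1. The sketch below shows one common way to pick such a cut-off by scanning the observed risk values. The labels and predicted risks are hypothetical toy data, not study data, and the function names are illustrative.

```python
def sens_spec(y_true, risk, cutoff):
    """Sensitivity and specificity when risk >= cutoff is classified as malignant."""
    tp = sum(1 for y, r in zip(y_true, risk) if y == 1 and r >= cutoff)
    fn = sum(1 for y, r in zip(y_true, risk) if y == 1 and r < cutoff)
    tn = sum(1 for y, r in zip(y_true, risk) if y == 0 and r < cutoff)
    fp = sum(1 for y, r in zip(y_true, risk) if y == 0 and r >= cutoff)
    return tp / (tp + fn), tn / (tn + fp)


def youden_cutoff(y_true, risk):
    """Return (cutoff, J, sensitivity, specificity) maximizing Youden's J.

    Candidate cut-offs are the distinct observed risks; on ties the
    lowest cut-off wins because only a strictly larger J replaces it.
    """
    best = None
    for c in sorted(set(risk)):
        se, sp = sens_spec(y_true, risk, c)
        j = se + sp - 1
        if best is None or j > best[1]:
            best = (c, j, se, sp)
    return best


# Hypothetical toy data: 0 = benign, 1 = malignant, with predicted risks.
y_true = [0, 0, 0, 0, 1, 1, 1, 1, 1, 1]
risk = [0.10, 0.30, 0.45, 0.70, 0.40, 0.60, 0.65, 0.80, 0.90, 0.95]

cutoff, j, se, sp = youden_cutoff(y_true, risk)
print(f"cutoff={cutoff:.2f} J={j:.3f} sensitivity={se:.3f} specificity={sp:.3f}")
```

A cut-off optimized in this way weights sensitivity and specificity equally, whereas the ADNEX results above use a fixed 20% risk-of-malignancy cut-off, so the reported sensitivity and specificity pairs reflect different operating points.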
© 2024 The Author(s). Ultrasound in Obstetrics & Gynecology published by John Wiley & Sons Ltd on behalf of International Society of Ultrasound in Obstetrics and Gynecology.