Abstract

Study question
Can deep learning-based diagnostic algorithms for pelvic ultrasonography be a useful clinical tool for evaluating the malignancy risk of ovarian neoplasms?

Summary answer
An AI-based deep learning diagnostic algorithm for pelvic ultrasonography is a potential clinical aid, as it exhibits excellent performance in the differential diagnosis of ovarian neoplasms.

What is known already
Transvaginal ultrasonography is a primary diagnostic tool for adnexal neoplasms because it is widely available and easy to use in gynecologic clinics. There have been attempts to use artificial intelligence algorithms to identify ovarian malignancy from sonographic evaluations, yet the existing literature is limited, and the optimal method for processing ultrasound images as AI input is not known. This study aims to address this gap by constructing a comprehensive AI-based model and comparing various input methods for developing an ultrasound-based diagnostic model.

Study design, size, duration
A retrospective cohort study was conducted on the clinical records of 1,171 patients diagnosed with ovarian neoplasms through pelvic ultrasonography from May 2002 to August 2021 at our institution. Patients were included if they underwent pelvic ultrasonographic examination in our department followed by surgical excision, and if a pathologic diagnosis was available for the identified ovarian neoplasm.

Participants/materials, setting, methods
The dataset was divided into training, validation, and testing subsets at a ratio of 8:1:1 (2373, 297, and 298 images, respectively). Ultrasound images were pre-processed after removing text annotations and unnecessary non-image data. Deep learning models with different architectures were used: Swin Transformer, ConvNeXt, MobileNet, ResNet-18, EfficientNet-V2, and VGG-16.
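The 8:1:1 division described above can be illustrated with a minimal sketch. This is not the study's procedure: the abstract does not state whether the split was made per image or per patient, whether it was stratified, or how rounding was handled, so the function name `split_indices`, the seed, and the floor-based rounding are all assumptions made for illustration. For that reason the subset sizes it produces need not match the reported 2373/297/298.

```python
import random


def split_indices(n_items, ratios=(8, 1, 1), seed=0):
    """Shuffle item indices and split them into train/val/test by ratio.

    Hypothetical helper: a plain image-level random split, assumed here
    only for illustration. The remainder after flooring goes to the test set.
    """
    total = sum(ratios)
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)  # reproducible shuffle

    n_train = n_items * ratios[0] // total
    n_val = n_items * ratios[1] // total

    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]
    return train, val, test


# 2968 is the sum of the image counts reported in the abstract.
train, val, test = split_indices(2968)
print(len(train), len(val), len(test))
```

When several images come from the same patient, splitting by patient rather than by image is a common way to avoid leakage between subsets; whether that was done here is not stated in the abstract.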
Main results and the role of chance
Six deep learning architectures were compared on the classification of ultrasound images. The metrics evaluated were sensitivity, specificity, positive predictive value (PPV, also known as precision), negative predictive value (NPV), F1 score, accuracy, and the area under the receiver operating characteristic curve (AUROC). Values are listed in the order MobileNet, ResNet-18, EfficientNet-V2, VGG-16, Swin Transformer, ConvNeXt:

Sensitivity: 0.830, 0.851, 0.879, 0.835, 0.883, 0.877
Specificity: 0.848, 0.840, 0.877, 0.866, 0.893, 0.906
PPV (precision): 0.848, 0.840, 0.877, 0.866, 0.893, 0.906
NPV: 0.830, 0.851, 0.877, 0.835, 0.879, 0.883
F1 score: 0.838, 0.845, 0.878, 0.848, 0.888, 0.890
Accuracy: 0.865, 0.865, 0.895, 0.875, 0.906, 0.909
AUROC: 0.920, 0.920, 0.930, 0.920, 0.930, 0.960

ConvNeXt achieved the highest value on every metric except sensitivity, where Swin Transformer was marginally higher (0.883 vs 0.877).

Limitations, reasons for caution
External validation of our model using datasets from outside our institution has not yet been carried out. In addition, the clinical validity of our model in real-life practice has not been investigated. These issues will be addressed in our continued investigation.

Wider implications of the findings
A deep learning model based on ConvNeXt yielded excellent diagnostic performance in the differential diagnosis of ovarian neoplasms. Integrating AI-based deep learning models into clinical settings has the potential to significantly enhance the diagnostic accuracy of ultrasonography for ovarian neoplasms.

Trial registration number
Not applicable
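For readers less familiar with the metrics reported under Main results, the sketch below shows how sensitivity, specificity, PPV, NPV, F1 score, and accuracy follow from the four cells of a binary confusion matrix, using their standard definitions. The function name `binary_metrics` and the example counts are hypothetical and are not taken from the study. AUROC is omitted because it requires per-image prediction scores rather than counts.

```python
def binary_metrics(tp, fp, tn, fn):
    """Compute standard binary classification metrics from confusion-matrix counts.

    tp/fn: malignant cases predicted malignant / benign.
    tn/fp: benign cases predicted benign / malignant.
    Assumes every denominator is non-zero.
    """
    sensitivity = tp / (tp + fn)   # recall, true positive rate
    specificity = tn / (tn + fp)   # true negative rate
    ppv = tp / (tp + fp)           # precision
    npv = tn / (tn + fn)
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    f1 = 2 * ppv * sensitivity / (ppv + sensitivity)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": ppv,
        "npv": npv,
        "f1": f1,
        "accuracy": accuracy,
    }


# Illustrative counts only (not study data).
metrics = binary_metrics(tp=8, fp=1, tn=9, fn=2)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Sensitivity and specificity depend only on the classifier's behaviour within each true class, whereas PPV and NPV also depend on the proportion of malignant cases in the evaluated set.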