Objective: To evaluate the value of a deep learning-based artificial intelligence (AI) automatic classification system in the differential diagnosis of non-lactating mastitis (NLM) and malignant breast tumors, by comparing it with conventional ultrasound interpretations made by sonographers of different seniority.

Methods: A total of 707 patients with breast lesions (475 with malignant breast tumors and 232 with NLM) were enrolled from three medical centers (Zhejiang Cancer Hospital, Hebei Province Hospital of Traditional Chinese Medicine, and Yantai Affiliated Hospital of Binzhou Medical University) between April 2020 and September 2021. All patients first underwent routine breast ultrasound. The images were then interpreted independently by a senior sonographer with more than 15 years of work experience and an intermediate-level sonographer with more than 5 years of work experience. A third physician analyzed the same ultrasound images with the deep learning-based AI automatic classification system, independently of the results from the other two physicians. The kappa test was used to evaluate the agreement between pathological results and the conventional ultrasound interpretations of physicians with different levels of experience.

Results: In total, 475 cases of malignant breast tumors (512 nodules) and 232 cases of NLM (255 nodules) were pathologically diagnosed. The accuracy, sensitivity, and specificity of conventional ultrasound interpretation varied with sonographer experience. For the intermediate-level sonographer they were 76.92% (590/767), 84.71% (216/255), and 73.05% (374/512); for the senior sonographer they were 87.35% (670/767), 86.27% (220/255), and 87.89% (450/512), respectively (P<0.001).
By comparison, at a threshold of 0.5, the accuracy, sensitivity, and specificity of the deep learning-based AI automatic classification system were 83.00%, 87.20%, and 85.33%, respectively, and the area under the curve was 0.926. The kappa test showed that the interpretations by the senior physician and by the deep learning-based AI automatic classification system were highly consistent with the postoperative pathological diagnoses, with kappa values of 0.72 and 0.71, respectively (both P<0.001). Agreement between the interpretations of the less experienced intermediate-level physician and the postoperative pathological diagnoses was lower, with a kappa value of 0.53 (P<0.001).

Conclusions: The deep learning-based AI automatic classification system is expected to become a reliable auxiliary tool for distinguishing NLM from malignant breast tumors, given its high sensitivity, accuracy, and specificity.
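
The reader-level figures in the Results can be recomputed from the reported nodule counts. The sketch below is illustrative only and is not the study's analysis code: it assumes NLM is treated as the positive class (inferred from the 255-nodule denominator used for sensitivity), and the names `ReaderCounts`, `diagnostic_metrics`, and `cohen_kappa` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReaderCounts:
    """2x2 counts for one reader versus pathology (assumption: NLM = positive)."""
    tp: int  # NLM nodules read as NLM
    fn: int  # NLM nodules read as malignant
    tn: int  # malignant nodules read as malignant
    fp: int  # malignant nodules read as NLM

    @property
    def total(self) -> int:
        return self.tp + self.fn + self.tn + self.fp


def diagnostic_metrics(c: ReaderCounts) -> dict:
    """Accuracy, sensitivity, and specificity as fractions in [0, 1]."""
    return {
        "accuracy": (c.tp + c.tn) / c.total,
        "sensitivity": c.tp / (c.tp + c.fn),
        "specificity": c.tn / (c.tn + c.fp),
    }


def cohen_kappa(c: ReaderCounts) -> float:
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    observed = (c.tp + c.tn) / c.total
    # Chance agreement from the marginal totals of reader and pathology.
    expected = (
        (c.tp + c.fp) * (c.tp + c.fn) + (c.fn + c.tn) * (c.tn + c.fp)
    ) / c.total**2
    return (observed - expected) / (1 - expected)


# Counts taken from the fractions reported in the Results
# (255 NLM nodules, 512 malignant nodules).
senior = ReaderCounts(tp=220, fn=255 - 220, tn=450, fp=512 - 450)
intermediate = ReaderCounts(tp=216, fn=255 - 216, tn=374, fp=512 - 374)

for name, counts in (("senior", senior), ("intermediate", intermediate)):
    m = diagnostic_metrics(counts)
    print(
        f"{name}: accuracy={m['accuracy']:.2%}, "
        f"sensitivity={m['sensitivity']:.2%}, "
        f"specificity={m['specificity']:.2%}, "
        f"kappa={cohen_kappa(counts):.2f}"
    )
```

Under these assumptions the kappa values come out at about 0.72 for the senior reader and 0.53 for the intermediate-level reader, in line with the reported values. The AI system's figures cannot be rechecked this way because its per-nodule counts are not given in the abstract.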