This study evaluates the accuracy of convolutional neural network (CNN) models in predicting malignancy on dynamic contrast-enhanced breast magnetic resonance imaging (DCE-BMRI). A total of 273 benign lesions (benign group) and 274 malignant lesions (malignant group) were collected and randomly divided into a training set (246 benign and 245 malignant lesions) and a testing set (28 benign and 28 malignant lesions) in a 9:1 ratio. An additional 53 lesions from 53 patients were designated as the validation set. Five models (VGG16, VGG19, DenseNet201, ResNet50, and MobileNetV2) were evaluated. Model performance was assessed using accuracy (Ac) in the training and testing sets, and precision (Pr), recall (Rc), F1 score (F1), and area under the receiver operating characteristic curve (AUC) in the validation set. On the testing set, the accuracy of VGG19 (0.96) was higher than that of VGG16 (0.91), DenseNet201 (0.91), ResNet50 (0.67), and MobileNetV2 (0.88). On the validation set, VGG19 achieved higher performance metrics (Pr 0.75, Rc 0.76, F1 0.73, AUC 0.76) than the other models: VGG16 (Pr 0.73, Rc 0.75, F1 0.70, AUC 0.73), DenseNet201 (Pr 0.71, Rc 0.74, F1 0.69, AUC 0.71), ResNet50 (Pr 0.65, Rc 0.68, F1 0.60, AUC 0.65), and MobileNetV2 (Pr 0.73, Rc 0.75, F1 0.71, AUC 0.73). Among the five fine-tuned models (S1 to S5), the S4 model achieved higher performance metrics (Pr 0.89, Rc 0.88, F1 0.87, AUC 0.89) than the other four: S1 (Pr 0.75, Rc 0.76, F1 0.74, AUC 0.75), S2 (Pr 0.77, Rc 0.79, F1 0.75, AUC 0.77), S3 (Pr 0.76, Rc 0.76, F1 0.73, AUC 0.75), and S5 (Pr 0.77, Rc 0.79, F1 0.75, AUC 0.77). Additionally, the S4 model showed the lowest loss value in the testing set. Notably, the AUC of S4 was 0.90 for BI-RADS 3 and 0.86 for BI-RADS 4, both significantly higher than the AUC of 0.65 for BI-RADS 5.
The proposed S4 model demonstrated superior performance in predicting the likelihood of malignancy in DCE-BMRI, making it a promising candidate for clinical application in patients with breast diseases. However, further validation on additional data is needed to confirm its efficacy.
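
For readers unfamiliar with the reported metrics, the following is a minimal sketch of how Pr, Rc, F1, and AUC can be computed for a binary benign/malignant classifier. It is not the authors' evaluation code. The function name `binary_metrics`, the label coding (1 = malignant, 0 = benign), and the 0.5 decision threshold are assumptions made for illustration, and the study may have used class-averaged variants of these metrics.

```python
from typing import Dict, Sequence


def binary_metrics(
    y_true: Sequence[int],
    y_score: Sequence[float],
    threshold: float = 0.5,  # assumed cut-off, not taken from the study
) -> Dict[str, float]:
    """Precision, recall, F1 and AUC for the positive (malignant = 1) class."""
    # Turn predicted malignancy probabilities into hard labels.
    y_pred = [1 if s >= threshold else 0 for s in y_score]

    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )

    # AUC as the probability that a randomly chosen malignant lesion
    # receives a higher score than a randomly chosen benign lesion
    # (ties count as one half). This equals the area under the ROC curve.
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one lesion from each class")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    auc = wins / (len(pos) * len(neg))

    return {"precision": precision, "recall": recall, "f1": f1, "auc": auc}
```

Precision, recall, and F1 depend on the chosen threshold, whereas AUC is computed from the ranking of the scores alone. This is one reason the abstract reports AUC alongside the other three.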