This study aimed to evaluate the diagnostic performance of a deep learning model based on multi-modal images in identifying the molecular subtypes of breast cancer. A total of 158 breast cancer patients (170 lesions; mean age, 50.8 ± 11.0 years), including 78 Luminal A subtype and 92 non-Luminal A subtype lesions, were retrospectively analyzed and divided into a training set (n = 100), a test set (n = 45), and a validation set (n = 25). Mammography (MG) and magnetic resonance imaging (MRI) images were used. Five single-modal models, i.e., MG, T2-weighted imaging (T2WI), diffusion-weighted imaging (DWI), axial apparent diffusion coefficient (ADC), and dynamic contrast-enhanced MRI (DCE-MRI), were selected. The deep learning network ResNet50 was used as the basic feature extraction and classification network to construct the molecular subtype identification model. Receiver operating characteristic curves were used to evaluate the prediction efficiency of each model. The accuracy, sensitivity, and specificity of the multi-modal model for identifying the Luminal A subtype were 0.711, 0.889, and 0.593, respectively, and the area under the curve (AUC) was 0.802 (95% CI, 0.657-0.906); the accuracy, sensitivity, and AUC were higher than those of any single-modal model, but the specificity was slightly lower than that of the DCE-MRI model. The AUC values of the MG, T2WI, DWI, ADC, and DCE-MRI models were 0.593 (95% CI, 0.436-0.737), 0.700 (95% CI, 0.545-0.827), 0.564 (95% CI, 0.408-0.711), 0.679 (95% CI, 0.523-0.810), and 0.553 (95% CI, 0.398-0.702), respectively. Combining deep learning with multi-modal imaging can help doctors identify breast cancer molecular subtypes and select personalized treatment plans.
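As an illustration of how the reported accuracy, sensitivity, and specificity relate to one another, the following minimal sketch computes all three from confusion-matrix counts. The counts are hypothetical: the abstract does not report a confusion matrix, nor does it state which data set the metrics were computed on. They were chosen here only because, assuming the 45-lesion test set with Luminal A as the positive class, they reproduce the reported figures. The function name `classification_metrics` is likewise an assumption for illustration.

```python
def classification_metrics(tp, fn, tn, fp):
    """Return (accuracy, sensitivity, specificity) from confusion-matrix counts."""
    total = tp + fn + tn + fp
    accuracy = (tp + tn) / total       # share of all lesions classified correctly
    sensitivity = tp / (tp + fn)       # share of positive (Luminal A) lesions detected
    specificity = tn / (tn + fp)       # share of negative (non-Luminal A) lesions detected
    return accuracy, sensitivity, specificity


# Hypothetical counts (an assumption, not reported in the source):
# 18 Luminal A and 27 non-Luminal A lesions in a 45-lesion set.
acc, sens, spec = classification_metrics(tp=16, fn=2, tn=16, fp=11)
print(f"accuracy={acc:.3f}, sensitivity={sens:.3f}, specificity={spec:.3f}")
```

Under these assumed counts, high sensitivity combined with modest specificity corresponds to relatively few missed Luminal A lesions but a larger number of non-Luminal A lesions labeled as Luminal A.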