Abstract

Breast cancer, marked by uncontrolled cell growth in breast tissue, is the most common cancer among women and the second leading cause of cancer-related deaths. Among its types, ductal and lobular carcinomas are the most prevalent, with invasive ductal carcinoma accounting for about 70–80% of cases and invasive lobular carcinoma for about 10–15%. Accurate identification is crucial for effective treatment, but manual assessment can be time-consuming and prone to interobserver variability. AI can rapidly analyze pathological images, providing precise, cost-effective identification and thus reducing the pathologists’ workload. This study uses a deep learning framework for automatic breast cancer detection and subtype identification. The framework comprises three key components: detecting cancerous patches, identifying cancer subtypes (ductal and lobular carcinoma), and predicting patient-level outcomes from whole slide images (WSIs). Validation includes Score-CAM visualization to highlight cancer-affected areas. The dataset includes 111 WSIs (85 malignant from the Warwick HER2 dataset and 26 benign obtained from pathologists). For subtype detection, there are 57 ductal and 8 lobular carcinoma cases. A total of 28,428 annotated patches were reviewed by two expert pathologists. Four pre-trained models (DenseNet-201, MobileNetV2, an ensemble of these two, and a Vision Transformer-based model) were fine-tuned and tested on the patches. Patient-level results were predicted using a majority voting technique based on the percentage of each patch type in the WSI. The Vision Transformer-based model outperformed the other models in patch classification, achieving an accuracy of 96.74% for cancerous patch detection and 89.78% for cancer subtype classification. At the WSI level, the majority voting method attained F1-scores of 99.06% for cancer classification and 96.13% for cancer subtype classification.
The proposed deep learning-based framework for breast cancer detection and subtype identification yielded promising results. It could offer hospitals, research centers, and pathology laboratories an economical, efficient means of obtaining accurate, clinically relevant output and improving diagnostic accuracy. Nonetheless, further studies are needed to validate its effectiveness across different environments and larger datasets.
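
The patient-level step described above, majority voting over the percentage of each patch type in a WSI, can be illustrated with a minimal sketch. The abstract does not give the exact voting rule, any decision threshold, or how ties are handled, so the function name `patient_level_prediction`, the label strings, and the alphabetical tie-break are assumptions made for illustration only, not the authors' implementation.

```python
from collections import Counter


def patient_level_prediction(patch_labels):
    """Aggregate per-patch predictions for one WSI into a slide-level label.

    Hypothetical helper: returns the most frequent patch label together with
    the percentage of patches assigned to each label.
    """
    if not patch_labels:
        raise ValueError("no patch predictions for this slide")

    counts = Counter(patch_labels)
    total = len(patch_labels)

    # Percentage of each patch type in the slide.
    percentages = {label: 100.0 * n / total for label, n in counts.items()}

    # Majority vote. Assumption: ties go to the alphabetically first label,
    # since the source does not specify a tie-breaking rule.
    majority_label = max(sorted(counts), key=counts.get)
    return majority_label, percentages


# Example: four patch-level predictions from a single slide.
label, pct = patient_level_prediction(["ductal", "ductal", "lobular", "ductal"])
print(label, pct)
```

In practice the patch labels would come from the fine-tuned patch classifier. A percentage threshold could replace the plain majority if a slide should be flagged when only a minority of its patches are cancerous; whether the study uses such a threshold is not stated in the abstract.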