Abstract

Background: Immunohistochemical evaluation of HER2 status, the hormone receptors ER and PR, and the proliferation marker Ki67 is part of the routine clinical diagnostic pathway for invasive breast carcinomas and is the cornerstone of treatment stratification, informing both prognosis and patient management. Pathologist scoring of immunohistochemistry (IHC) at the microscope is time-consuming and prone to significant inter- and intra-observer variability. We developed HALO Breast AI, a decision-support system designed to improve efficiency and diagnostic accuracy by automating whole slide image (WSI) scoring. Here, we present preliminary results of a validation study of HALO Breast AI.

Methods: HALO Breast AI was developed with routine diagnostic cases sourced from three institutes. The algorithm was trained using 107,755 pathologist-reviewed annotations to identify and threshold DAB-positive tumor cells within automatically segmented tumor regions. Internal validation was conducted on 80 unseen WSI, using 60,012 pathologist-reviewed annotations to assess analytical performance. Consensus agreement was assessed by comparing the algorithm scores to the mode of 3 expert pathologists (where at least 2 out of 3 agreed). Clinical performance and generalizability were assessed by comparing the algorithm scores to clinical data from two independent external institutes, across 200 unseen WSI (n=50 per marker) from institute 1 and 300 unseen WSI (n=100 per marker; ER, PR and HER2 only) from institute 2.

Results: The median image F1-score for tumor classification was 0.91, while the median image F1-score for cell-level validation was 0.96. Internal validation showed agreement between HALO Breast AI and the mode of 3 expert pathologists of 100% for ER, 90% for PR, 95% for Ki67 and 90% for HER2. High concordance was measured between the algorithm and the pathologists’ scores, with Light’s kappa of 0.91 for ER, 0.85 for PR, 0.79 for HER2 and 0.82 for Ki67. On WSI from external institute 1, agreement between the clinical score obtained from HALO Breast AI and the clinical data was 96% for ER, 94% for PR, 84% for HER2 and 84% for Ki67. For external institute 2, agreement was 90% for ER, 90% for PR and 83% for HER2. Of the ER and PR cases that were in disagreement at the 1% clinical cut-off, the algorithm percent-positive scores were within a 1-3% range in all 5 cases from institute 1 and in 14 of the 20 cases from institute 2. Thus, although the AI-assigned category disagreed with the clinical category on a categorical scale, the results were close on a continuous scale.

Conclusions: HALO Breast AI accurately detects tumor regions and tumor cells within breast cancer tissue and demonstrates high clinical agreement when scoring routine diagnostic IHC. Additionally, HALO Breast AI shows good generalizability, with consistently strong performance across the inherent variability between external, independent data sets. Computer-aided diagnostic tools such as HALO Breast AI have the potential to support pathologists in the diagnostic setting by improving workflow efficiency and standardizing results.

Citation Format: Meredith Lodge, Alastair Ironside, Ashley Graham, Antonio Polonia, Stefan Reinhard, Wiebke Solass, Inti Zlobec, Peter Caie. Development & Validation of an AI-supported Workflow for Clinical Scoring of HER2, ER, PR & Ki67 Immunohistochemistry in Breast Cancer Tissue [abstract]. In: Proceedings of the 2023 San Antonio Breast Cancer Symposium; 2023 Dec 5-9; San Antonio, TX. Philadelphia (PA): AACR; Cancer Res 2024;84(9 Suppl):Abstract nr PO3-07-03.
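The agreement statistics reported above (agreement with the mode of 3 pathologists where at least 2 agreed, and Light's kappa) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names and the toy scores are hypothetical, and the abstract does not state which raters were included in the kappa calculation. The sketch assumes the standard definition of Light's kappa as the mean of Cohen's kappa over all rater pairs.

```python
from collections import Counter
from itertools import combinations


def consensus_mode(scores):
    """Return the score given by at least 2 raters, or None if no two agree."""
    label, count = Counter(scores).most_common(1)[0]
    return label if count >= 2 else None


def percent_agreement(algorithm, consensus):
    """Percent of cases where the algorithm matches the consensus.

    Cases without a consensus (None) are excluded, which is an assumption.
    """
    pairs = [(a, c) for a, c in zip(algorithm, consensus) if c is not None]
    return 100.0 * sum(a == c for a, c in pairs) / len(pairs)


def cohen_kappa(r1, r2):
    """Unweighted Cohen's kappa for two raters scoring the same cases."""
    n = len(r1)
    observed = sum(a == b for a, b in zip(r1, r2)) / n
    c1, c2 = Counter(r1), Counter(r2)
    expected = sum(c1[k] * c2[k] for k in c1) / (n * n)
    if expected == 1:  # both raters used a single identical category
        return 1.0
    return (observed - expected) / (1 - expected)


def lights_kappa(raters):
    """Light's kappa: mean Cohen's kappa over all pairs of raters."""
    kappas = [cohen_kappa(a, b) for a, b in combinations(raters, 2)]
    return sum(kappas) / len(kappas)


# Hypothetical toy data: 5 cases, 3 pathologists, 1 algorithm (not study data).
p1 = ["pos", "pos", "neg", "pos", "neg"]
p2 = ["pos", "neg", "neg", "pos", "neg"]
p3 = ["pos", "pos", "neg", "neg", "neg"]
algo = ["pos", "pos", "neg", "pos", "pos"]

consensus = [consensus_mode(case) for case in zip(p1, p2, p3)]
agreement = percent_agreement(algo, consensus)
kappa = lights_kappa([p1, p2, p3, algo])
```

For a multi-category score such as HER2, three pathologists can each give a different score; `consensus_mode` then returns `None` and the sketch drops the case from the agreement calculation.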