Abstract

Background: HER2 expression level is a key factor in determining the optimal treatment course for breast cancer patients. Roughly 15% of breast cancers are HER2(+), and HER2 status is routinely assessed by immunohistochemistry (IHC). Accurate assessment of the HER2 IHC score (0, 1+, 2+, 3+) by pathologists is therefore critical, especially in light of novel therapeutic approaches demonstrating efficacy in the HER2-low setting (IHC scores 1+ and 2+/FISH-) [1,2]. To help pathologists provide reproducible and accurate scores across the entire HER2 scoring range, we developed a machine-learning model ("AIM-HER2") that generates slide-level HER2 scores aligned with ASCO-CAP guidelines in clinical breast cancer HER2 IHC specimens.
Methods: AIM-HER2 was developed using whole-slide images (WSI; N=4261) from clinical and commercial sources. WSI were split into training (N=2694, 63%) and optimization (N=1567, 37%) sets. An additive multiple instance learning (aMIL) model [3] was trained to predict HER2 scores directly from WSI and to create interpretable heatmaps that depict HER2 predictions in tissue images. Image artifacts and in situ carcinomas were identified using previously trained artifact and tissue segmentation models and were excluded, leaving only regions of invasive carcinoma to be analyzed. AIM-HER2 performance was assessed on additional slides obtained from five academic or commercial sources (N=804 total, 770 evaluable) on which HER2 IHC was performed. Board-certified pathologists (N=52) with relevant experience provided manual HER2 scores based on ASCO-CAP guidelines. Nested pairwise non-inferiority analysis [4] was used to compare model performance to that of pathologists (N=3 pathologists per slide). In the nested pairwise framework, agreement among pathologists was compared to agreement between AIM-HER2 and pathologists via linear kappa, so that summary metrics account for inter-pathologist variability.
Results: High concordance was observed between AIM-HER2-predicted and pathologist-labeled slide-level HER2 scores, both overall and at each scoring level. Similar results were observed when AIM-HER2 was evaluated on images from multiple slide scanners and on slides stained with multiple HER2 IHC antibody clones. Results are summarized in Table 1.
Conclusions: We developed AIM-HER2, a novel aMIL-based approach for predicting slide-level HER2 IHC scores. AIM-HER2 agrees with pathologists about as well as pathologists agree with each other when determining HER2 score. This result holds when slides are imaged using multiple scanning platforms and stained using multiple HER2 antibody clones. The performance of AIM-HER2 across multiple scanners and assays supports broad applicability of this algorithm in clinical laboratories, including for the identification of HER2-low cases. Work is ongoing to perform similar analyses in an independent, real-world dataset.
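
The agreement comparison described in Methods can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes HER2 scores are coded as integers 0 to 3 (0, 1+, 2+, 3+), that each slide has scores in three fixed reader slots, and that the summary statistic is simply the mean model-versus-reader linear kappa minus the mean reader-versus-reader linear kappa. The function names `linear_kappa` and `nested_pairwise_difference` are hypothetical, and the confidence interval and non-inferiority margin that a full analysis would need are omitted.

```python
from itertools import combinations


def linear_kappa(a, b, n_levels=4):
    """Linearly weighted Cohen's kappa for two lists of ordinal scores 0..n_levels-1."""
    if len(a) != len(b) or len(a) == 0:
        raise ValueError("score lists must be non-empty and of equal length")
    n = len(a)
    # Joint distribution of the two raters' scores.
    observed = [[0.0] * n_levels for _ in range(n_levels)]
    for x, y in zip(a, b):
        observed[x][y] += 1.0 / n
    row = [sum(observed[i]) for i in range(n_levels)]
    col = [sum(observed[i][j] for i in range(n_levels)) for j in range(n_levels)]
    d_obs = 0.0  # observed weighted disagreement
    d_exp = 0.0  # disagreement expected by chance from the marginals
    for i in range(n_levels):
        for j in range(n_levels):
            w = abs(i - j) / (n_levels - 1)  # linear disagreement weight
            d_obs += w * observed[i][j]
            d_exp += w * row[i] * col[j]
    if d_exp == 0.0:
        return float("nan")  # kappa is undefined when both raters use one identical level
    return 1.0 - d_obs / d_exp


def nested_pairwise_difference(model, readers):
    """Return (difference, model-vs-reader mean kappa, reader-vs-reader mean kappa)."""
    model_vs = [linear_kappa(model, r) for r in readers]
    reader_vs = [linear_kappa(r1, r2) for r1, r2 in combinations(readers, 2)]
    mean_model = sum(model_vs) / len(model_vs)
    mean_reader = sum(reader_vs) / len(reader_vs)
    return mean_model - mean_reader, mean_model, mean_reader


# Toy data (made up for illustration): four slides, three reader slots.
readers = [[0, 1, 2, 3], [0, 1, 2, 3], [1, 1, 2, 3]]
model = [0, 1, 2, 3]
diff, k_model, k_reader = nested_pairwise_difference(model, readers)
print(f"model-reader kappa={k_model:.3f}, reader-reader kappa={k_reader:.3f}, diff={diff:.3f}")
```

A difference near or above zero means the model agrees with readers at least as well as readers agree with one another. A non-inferiority claim would additionally require the lower confidence bound of this difference (for example from a slide-level bootstrap, which is an assumption here) to exceed a prespecified negative margin.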
