Abstract
For 2 decades, the American Society of Clinical Oncology-College of American Pathologists human epidermal growth factor receptor 2 (HER2) testing criteria have included 0 and 1+ scores, but this distinction was inconsequential. Now, based on the DESTINY-Breast04 trial (DB-04) results, it underpins eligibility for trastuzumab-deruxtecan treatment in patients with metastatic breast cancer. Discerning 0 from 1+ immunohistochemistry (IHC) staining is challenging, as HER2 low is not a biologically distinct cancer subset, there are no reference standards or controls, and second-tier tests (eg, in situ hybridization) do not apply. Prior reports cast doubt on the reliability of pathologists' IHC scoring, with resulting treatment misalignments. With institutional review board approval, our group of 9 breast pathologists from 8 Australian laboratories had previously established HER2-low-focused scoring conventions, based on the American Society of Clinical Oncology-College of American Pathologists 2018 HER2 guidelines and specifying common staining pitfalls. We reported the results of the first set of 60 breast cancers evaluated with these methods. After a 5-month washout, for the present validation study, we compiled a second set of 64 HER2-negative invasive breast cancer core biopsies, all assessed with the Ventana 4B5 HER2 assay. Each of us scored digitized images of the HER2 IHC slides of these cases. Using the majority opinion as the target score, we calculated our performance metrics. We compared our performance in set 1 and set 2 to assess the effectiveness of our approach and learning retention. The cases in this validation set included 40 (62.5%) HER2 low, 10 (17.2%) ultralow (UL), and 13 (18.8%) null cancers. Concordance was not achieved in 1 case.
For distinguishing HER2 low from other cancers (UL and null combined), the mean values of our performance metrics were accuracy 89.58%, sensitivity 90.83%, specificity 87.50%, positive predictive value 95.63%, negative predictive value 83.59%, and Cohen kappa score 0.81. Compared with our initial study, we have maintained our high level of performance across these parameters, and our mean kappa score is now in the excellent range for concordance. Maintaining high performance across a range of measures in 2 separate data sets validates the effectiveness of our HER2-low-focused scoring conventions. Having validated our approach, we will use these reference case sets with expert-level consensus scores for peer training and for updating our national HER2 IHC external quality assurance program. In our ongoing studies, we are also assessing the performance of software algorithms to determine their suitability for the prescreening of HER2 IHC slides.
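As an illustration of how the metrics named above relate to one another, the following is a minimal sketch and not the study's actual analysis code. It assumes a strict-majority rule for the target score and a binary comparison of one rater against that target (HER2 low versus UL and null combined). The function names and the example counts are hypothetical and are not taken from the study data.

```python
from collections import Counter


def majority_score(scores):
    """Return the score given by a strict majority of raters, else None.

    Assumption: 'majority opinion' means more than half of the raters agree.
    """
    label, count = Counter(scores).most_common(1)[0]
    return label if count > len(scores) / 2 else None


def binary_metrics(target, rater):
    """Compare one rater with the target; True means 'HER2 low'."""
    pairs = list(zip(target, rater))
    tp = sum(1 for t, r in pairs if t and r)
    tn = sum(1 for t, r in pairs if not t and not r)
    fp = sum(1 for t, r in pairs if not t and r)
    fn = sum(1 for t, r in pairs if t and not r)
    n = tp + tn + fp + fn

    accuracy = (tp + tn) / n
    # Chance agreement expected from the marginal totals (Cohen kappa).
    expected = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / n**2
    return {
        "accuracy": accuracy,
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "kappa": (accuracy - expected) / (1 - expected),
    }


# Hypothetical rater: 40 target-low cases (36 called low) and
# 23 target-non-low cases (20 called non-low).
target = [True] * 40 + [False] * 23
rater = [True] * 36 + [False] * 4 + [False] * 20 + [True] * 3
m = binary_metrics(target, rater)
```

Averaging each entry of the returned dictionary over all raters would give mean values of the kind reported above, assuming the means were taken per pathologist.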