Background

Molecular profiling of estrogen receptor (ER), progesterone receptor (PR), and ERBB2 (also known as Her2) is essential for breast cancer diagnosis and treatment planning. Nevertheless, current methods rely on the qualitative interpretation of immunohistochemistry and fluorescence in situ hybridization (FISH), which can be costly, time-consuming, and inconsistent. Here we explore the clinical utility of predicting receptor status from digitized hematoxylin and eosin-stained (H&E) slides using machine learning trained and evaluated on a multi-institutional dataset.

Methods

We developed a deep learning system to predict ER, PR, and ERBB2 statuses from digitized H&E slides and evaluated its utility in three clinical applications: identifying hormone receptor-positive patients, serving as a second-read tool for quality assurance, and addressing intratumor heterogeneity. For development and validation, we collected 19,845 slides from 7,950 patients across six independent cohorts representative of diverse clinical settings.

Results

Here we show that the system identifies 30.5% of patients as hormone receptor-positive, achieving a specificity of 0.9982 and a positive predictive value of 0.9992, demonstrating its ability to determine eligibility for hormone therapy without immunohistochemistry. By restaining and reassessing samples flagged as potential false negatives, we discover 31 cases of misdiagnosed ER, PR, and ERBB2 statuses.

Conclusions

These findings demonstrate the utility of the system in diverse clinical settings and its potential to improve breast cancer diagnosis. Given the substantial focus of current guidelines on reducing false negative diagnoses, this study supports the integration of H&E-based machine learning tools into workflows for quality assurance.
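
For readers less familiar with the metrics reported in the Results, the following minimal sketch shows how specificity, positive predictive value, and the fraction of patients flagged as positive are conventionally computed from confusion-matrix counts. The function names and the counts are hypothetical, chosen only for illustration; they are not the study's data or the authors' code.

```python
def specificity(tn: int, fp: int) -> float:
    """True negatives divided by all actual negatives: TN / (TN + FP)."""
    if tn + fp == 0:
        raise ValueError("no actual negatives; specificity is undefined")
    return tn / (tn + fp)


def positive_predictive_value(tp: int, fp: int) -> float:
    """True positives divided by all predicted positives: TP / (TP + FP)."""
    if tp + fp == 0:
        raise ValueError("no predicted positives; PPV is undefined")
    return tp / (tp + fp)


def fraction_flagged_positive(tp: int, fp: int, tn: int, fn: int) -> float:
    """Share of all patients the model labels positive: (TP + FP) / total."""
    total = tp + fp + tn + fn
    if total == 0:
        raise ValueError("no patients")
    return (tp + fp) / total


if __name__ == "__main__":
    # Hypothetical counts for illustration only (NOT taken from the study).
    tp, fp, tn, fn = 300, 1, 499, 200
    print("specificity:", specificity(tn, fp))
    print("PPV:", positive_predictive_value(tp, fp))
    print("fraction flagged positive:", fraction_flagged_positive(tp, fp, tn, fn))
```

A high-confidence operating point of this kind trades coverage for certainty: the model labels only a subset of patients as positive, so specificity and positive predictive value can both be very high even when many true positives are left unflagged.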