Abstract

Background: Phenotypic characteristics and genetic content have been observed to vary within individual tumors. However, translational research and some clinical practices often treat a single histopathology slide as representative of whole-tumor biology, even when the extent of intra-tumor biological heterogeneity is unknown. Technical factors in slide preparation and digitization can also contribute to variability. Here, we use digital pathology models to identify cell and tissue types in whole-slide images of breast cancer tumors, and quantify intra- and inter-case heterogeneity in cell and tissue features of the tumor microenvironment.

Methods: We developed PathExplore, a suite of convolutional neural network models trained on pathologist annotations of cell and tissue types in hematoxylin and eosin-stained slides. The models were deployed on the TCGA breast cancer dataset (n=1083 primary solid tumors). Quantitative human-interpretable features (HIFs) relating to cancer, stromal, and necrotic tissue areas, as well as the abundance and distribution of cancer cells, fibroblasts, and immune cells within these tissues, were extracted for each slide. Similarity between slides on sets of HIFs was assessed using Pearson correlation. Intra- and inter-case variability of individual HIFs was measured using the normalized percent difference (range divided by mean) for each slide pair. Case-specificity was quantified by the AUC of correlation or difference distributions for intra-case vs inter-case slide pairs. Only pairs of slides with matching metadata, including tissue source site, cancer stage, and scanner model, were considered for inter-case analysis.

Results: A total of 55 cases, comprising 114 slides, were identified as having more than one slide per case. Considering all 123 proportional HIFs, intra-case slide pairs were more highly correlated than inter-case pairs (r=0.97 vs r=0.90, AUC=0.88). Of the slides from multi-slide cases, 29% were most closely correlated with another slide from the same case, out of the entire TCGA breast dataset. Subsets of these proportional HIFs relating to each identified cell type were also case-specific, with AUCs ranging from 0.79 for macrophage HIFs to 0.87 for cancer cell HIFs. All proportion, density, and ratio HIFs (n=297) individually showed case specificity (AUC > 0.5), with median intra-case differences ranging from 0.6% to 69%. Area proportion of necrosis was more variable than that of cancer or stroma tissue, while cancer cell count proportions were less variable than fibroblast or immune cell proportions. Cell count proportions were in general less variable and more case-specific than their corresponding density HIFs; for instance, the frequency of fibroblasts out of all cells in stroma had a median intra-case difference of 11%, compared to 20% for the density of fibroblasts in stroma.

Conclusions: We quantified heterogeneity of key features of the tumor microenvironment both within and across tumors. These results reveal biological and technical variability that can inform selection and interpretation of biomarkers derived from single slides. The ability to uniquely identify slides from the same case additionally demonstrates the technical robustness of digital pathology models for yielding quantitative insights into tumor biology.

Citation Format: Ylaine Gerardin, Christian Kirkup, Archit Khosla, Laura Chambre, Michael Drage, Amaro Taylor-Weiner. Digital pathology models reveal case-specific characteristics of the tumor microenvironment [abstract]. In: Proceedings of the 2023 San Antonio Breast Cancer Symposium; 2023 Dec 5-9; San Antonio, TX. Philadelphia (PA): AACR; Cancer Res 2024;84(9 Suppl):Abstract nr PO1-15-01.
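The two pair-level metrics named in the Methods, the normalized percent difference (range divided by mean) and the AUC separating intra-case from inter-case slide pairs, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the toy HIF values, the handling of ties, and the choice to treat larger differences as the "inter-case" direction are all assumptions made for illustration, and HIF values are assumed to be non-negative.

```python
from itertools import product


def normalized_percent_difference(a: float, b: float) -> float:
    """Range divided by mean for one slide pair, expressed as a percentage.

    Assumes non-negative HIF values (proportions, densities, ratios).
    """
    mean = (a + b) / 2.0
    if mean == 0:
        # Assumption: both values are zero, so the pair does not differ.
        return 0.0
    return abs(a - b) / mean * 100.0


def pairwise_auc(lower_group, higher_group) -> float:
    """Probability that a random value from higher_group exceeds a random
    value from lower_group, counting ties as one half.

    This is the rank-based (Mann-Whitney) form of the AUC; 0.5 means the
    two distributions are indistinguishable, 1.0 means perfect separation.
    """
    wins = 0.0
    for low, high in product(lower_group, higher_group):
        if high > low:
            wins += 1.0
        elif high == low:
            wins += 0.5
    return wins / (len(lower_group) * len(higher_group))


# Hypothetical values of one HIF (e.g. a cell-count proportion) on four slides:
# two slides from case A and two slides from case B.
case_a = (0.30, 0.32)
case_b = (0.50, 0.47)

intra_case = [
    normalized_percent_difference(*case_a),
    normalized_percent_difference(*case_b),
]
inter_case = [
    normalized_percent_difference(a, b) for a, b in product(case_a, case_b)
]

# Case specificity of this HIF: how often an inter-case pair differs more
# than an intra-case pair.
auc = pairwise_auc(intra_case, inter_case)
```

For correlation-based comparisons the direction would be reversed, since intra-case pairs are expected to have the higher values; in that case the inter-case correlations would be passed as `lower_group`.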