Abstract

Background: Tissue microarrays (TMAs) stand to significantly improve the efficiency of multiplex spatial studies by placing core samples from many tumor blocks on a single slide. However, as TMAs become increasingly prevalent, the effect of where a core is taken within the tumor microenvironment (TME) on downstream discoveries is not well understood. This raises the question of which TME contexts are most informative for downstream analyses.

Methods: We focus on 5-way PAM50 subtype prediction in breast cancer. We segment whole tissue section (WTS) slides with a U-Net trained on pathologist annotations of tumor, DCIS, LCIS, normal, stromal, immune, and necrotic microenvironment contexts in 100 TCGA BRCA slides. We produce image embeddings for each core using a vision transformer pre-trained on histopathology images and use an attention-based aggregation model to predict PAM50 subtype. To evaluate the impact of core TME context on a held-out test set, we sample circular sections of 0.6 mm radius from 281 WTS images from 54 patients. This yields two simulated TMA datasets: a “majority non-cancerous” dataset of cores that the TME annotation model predicts to be majority normal breast epithelial or necrotic tissue, and a “majority cancerous” dataset of the remaining cores.

Results: We find that core microenvironment context directly influences diagnostic accuracy. Maximizing tumor content improves accuracy on the PAM50 task: majority cancerous TMAs obtain an AUROC of 0.80 (95% confidence interval [CI] 0.75-0.85), compared with 0.72 (95% CI 0.66-0.78) for majority non-cancerous TMAs and 0.82 (95% CI 0.80-0.84) for WTS images. Notably, context-aware TMA sampling comes within 0.02 AUROC of the WTS images (0.80 vs. 0.82), while poorly selected TMA cores underperform the WTS baseline by about 0.10 AUROC on average (0.72 vs. 0.82).
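The core simulation described in Methods can be illustrated with a minimal sketch. It assumes the segmentation output is a 2-D integer label mask, and it is not the authors' implementation: the class indices in `CLASSES`, the helper names, the pixel-size conversion, and the 0.5 threshold used to read "majority" are all assumptions made for illustration.

```python
import numpy as np

# Hypothetical class indices for the segmentation mask (not given in the abstract).
CLASSES = {"tumor": 0, "dcis": 1, "lcis": 2, "normal": 3,
           "stroma": 4, "immune": 5, "necrosis": 6}
# Contexts the abstract treats as non-cancerous for the purpose of core labeling.
NON_CANCEROUS = ("normal", "necrosis")


def mm_to_px(radius_mm, microns_per_pixel):
    """Convert a core radius in millimetres to pixels at a given resolution."""
    return radius_mm * 1000.0 / microns_per_pixel


def core_class_fractions(mask, center_rc, radius_px):
    """Return the fraction of pixels inside a circular core assigned to each class."""
    rows, cols = np.ogrid[:mask.shape[0], :mask.shape[1]]
    r0, c0 = center_rc
    in_core = (rows - r0) ** 2 + (cols - c0) ** 2 <= radius_px ** 2
    counts = np.bincount(mask[in_core], minlength=len(CLASSES))
    return counts / counts.sum()


def label_core(mask, center_rc, radius_px):
    """Label a simulated core by its predicted majority context.

    Assumption: a core is 'majority non-cancerous' when normal plus necrotic
    pixels together exceed half of the core area; all other cores are
    'majority cancerous'.
    """
    fractions = core_class_fractions(mask, center_rc, radius_px)
    non_cancerous = sum(fractions[CLASSES[name]] for name in NON_CANCEROUS)
    if non_cancerous > 0.5:
        return "majority non-cancerous"
    return "majority cancerous"
```

In use, `label_core` would be called once per candidate core center on each WTS mask, with `radius_px = mm_to_px(0.6, microns_per_pixel)` for whatever resolution the mask was produced at; the resulting labels split the simulated cores into the two datasets.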
While TCGA is curated to contain patients with a very high tumor burden, and consequently random TMA selection performs well on this cohort (AUROC 0.77; 95% CI 0.74-0.79), our study indicates that tissue context is an important consideration when selecting cores from real donor tissue blocks. Our results demonstrate that TMA core sampling protocols should be informed by TME context to maximize analysis accuracy, and that intelligent core sampling can largely close the gap to WTS image accuracy on clinically relevant tasks like PAM50 subtype prediction. These findings highlight the need for thoughtful TMA dataset construction and usage, especially as core selection becomes increasingly automated. We argue both for more targeted core selection from donor tissue blocks and for filtering of existing datasets. To avoid performance degradation, pipelines should identify and exclude cores drawn from less task-relevant TME contexts. Simulating core sampling on WTS images offers an efficient, task-specific way to estimate which TME contexts are relevant.

Citation Format: Addie Woicik, Zachary R. McCaw, Ben Dulken, insitro Research Team, Christopher Probert. Exploring the role of microenvironment context on diagnostic accuracy in tissue microarray core selection [abstract]. In: Proceedings of the American Association for Cancer Research Annual Meeting 2024; Part 1 (Regular Abstracts); 2024 Apr 5-10; San Diego, CA. Philadelphia (PA): AACR; Cancer Res 2024;84(6_Suppl):Abstract nr 6183.