Introduction
Immune dysregulation plays a major role in cancer progression. Quantifying the spatial distribution of lymphocytic inflammation may enable spatial systems biology, improve understanding of therapeutic resistance, and contribute to prognostic imaging biomarkers.

Methods
In this paper, we propose a knowledge-guided deep learning framework to measure the lymphocytic spatial architecture on human H&E tissue, where the fidelity of training labels is maximized through single-cell resolution image registration of H&E to IHC. We demonstrate that such an approach enables pixel-perfect ground-truth labeling of lymphocytes on H&E as measured by IHC. We then experimentally validate our technique in a genetically engineered, immune-compromised Rag2 mouse model, where Rag2 knockout mice lacking mature lymphocytes are used as a negative experimental control. Such experimental validation moves beyond the classical statistical testing of deep learning models and demonstrates the feasibility of more rigorous validation strategies that integrate computational science and basic science.

Results
Using our approach, we automatically annotated more than 111,000 human nuclei (45,611 CD3/CD20-positive lymphocytes) on H&E images to develop our model, which achieved an AUC of 0.78 on internal hold-out testing data and 0.71 on external testing with an independent dataset. As a measure of the global spatial architecture of the lymphocytic microenvironment, the average structural similarity between predicted and ground-truth lymphocytic density maps was 0.86 ± 0.06 on testing data. On experimental mouse model validation, we measured a lymphocytic density of 96.5 ± 1% in a Rag2+/- control mouse, compared to an average of 16.2 ± 5% in Rag2-/- immune knockout mice (p < 0.0001, ANOVA).

Discussion
These results demonstrate that CD3/CD20-positive lymphocytes can be accurately detected and characterized on H&E by deep learning, and that this detection generalizes across species.
Collectively, these data suggest that our understanding of complex biological systems may benefit from computationally derived spatial analysis, as well as from the integration of computational science and basic science.
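
The density-map comparison reported in the Results can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the function names `density_map` and `global_ssim`, the 64-pixel tile size, the definition of lymphocytic density as the fraction of nuclei per tile classified as lymphocytes, and the use of a single-window (global) variant of structural similarity in place of the usual locally windowed SSIM.

```python
import numpy as np


def density_map(centroids, is_lymphocyte, image_shape, tile=64):
    """Per-tile fraction of nuclei labeled as lymphocytes (hypothetical definition).

    centroids: (N, 2) array of nucleus (row, col) positions in pixels.
    is_lymphocyte: (N,) boolean array, True for lymphocyte nuclei.
    image_shape: (height, width) of the image in pixels.
    """
    h, w = image_shape
    y_edges = np.arange(0, h + tile, tile)
    x_edges = np.arange(0, w + tile, tile)
    ys, xs = centroids[:, 0], centroids[:, 1]
    # Count all nuclei and lymphocyte nuclei in each tile.
    total, _, _ = np.histogram2d(ys, xs, bins=[y_edges, x_edges])
    pos, _, _ = np.histogram2d(
        ys[is_lymphocyte], xs[is_lymphocyte], bins=[y_edges, x_edges]
    )
    # Tiles without nuclei get a density of 0.
    return np.divide(pos, total, out=np.zeros_like(pos), where=total > 0)


def global_ssim(a, b, data_range=1.0):
    """Single-window structural similarity between two maps.

    Simplification: standard SSIM averages this statistic over local
    windows; here it is computed once over the whole map.
    """
    c1 = (0.01 * data_range) ** 2
    c2 = (0.03 * data_range) ** 2
    mu_a, mu_b = a.mean(), b.mean()
    var_a, var_b = a.var(), b.var()
    cov = ((a - mu_a) * (b - mu_b)).mean()
    return ((2 * mu_a * mu_b + c1) * (2 * cov + c2)) / (
        (mu_a**2 + mu_b**2 + c1) * (var_a + var_b + c2)
    )
```

Under these assumptions, calling `density_map` once with predicted labels and once with IHC-derived labels for the same nuclei, then passing both maps to `global_ssim`, yields a score that reaches 1 only when the two maps agree.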