Introduction High-grade B-cell lymphoma (HGBL) is an aggressive lymphoma that often harbors MYC rearrangements (MYC-R) and molecular signatures attributed to aberrant MYC activation (Olszewski et al. Blood 2022). Identifying and classifying HGBL is challenging: current classification systems recognize diffuse large B-cell lymphoma/HGBL with MYC and BCL2 rearrangements (MYC-R/BCL2-R; double-hit; defined molecularly) and HGBL, not otherwise specified (defined morphologically) in which MYC-R occur in up to 45% of cases and the double-hit signature (DHITsig) occurs in 54% of cases (Alaggio et al. Leukemia 2022, Campo et al. Blood 2022, Olszewski et al. Blood 2022). Existing methods for molecular classification, such as fluorescence in situ hybridization (FISH), are expensive, time-consuming and not widely available, and morphological classification is subjective and associated with high inter-reader variability (Natkunam et al. Histopathology 2023). Therefore, we applied a deep-learning approach to identify MYC-driven HGBL from whole-slide images (WSIs) of ubiquitously available hematoxylin and eosin-stained slides. Methods The model was trained on a real-world data set containing 600 WSIs from multiple sites, using MYC-R status determined by FISH as the label. The model was evaluated on an independent test set comprising 287 WSIs from the GOYA study (ClinicalTrials.gov identifier: NCT01287741). WSIs were scanned with a Roche Ventana DP200 scanner at 0.25 μm/px. The model architecture extracted cytological features from single cells and architectural features from larger tissue regions, enabling the quantification of high-grade morphology characterized by monomorphic sheets of dense cells with round, intermediately sized nuclei and finely dispersed chromatin.
Model predictions were compared with MYC-R and BCL2-R with the immunoglobulin heavy chain locus (IGH), gene expression signatures (including DHITsig and cell of origin [COO]), gene mutation signatures obtained by next-generation sequencing, computationally derived morphological measurements, and morphological evaluation by hematopathologists. MCD (MYD88/CD79B mutations) and A53 (TP53 mutations) genetic subtypes were determined by LymphGen. Statistical comparisons were made using Kendall's tau-b (τb) correlation for continuous variables and Fisher's exact test with conditional maximum likelihood-estimated odds ratios (ORs) for categorical variables. Results In the test set, the model distinguished between cases with and without MYC-R (IGH) with an area under the receiver operating characteristic curve of 0.85, sensitivity of 0.95, specificity of 0.59, positive predictive value of 0.16, and negative predictive value of 0.99. The model correctly classified 21/22 MYC-R (IGH) cases and all four MYC-R/BCL2-R (IGH) double-hit cases. Of DHITsig cases, the model classified 25/33 as MYC-driven HGBL. Model scores correlated with MYC expression (τb = 0.25; p < 0.0001) and Ki67 expression (τb = 0.15; p < 0.0001) (Figure 1A). Tumors classified as MYC-driven HGBL were enriched for the non-germinal center B-cell-like COO (OR 2.0; p < 0.01) and the MCD genetic subtype (OR 1.9; p < 0.05), and depleted for the A53 genetic subtype (OR 0.3; p < 0.001). Additionally, enrichment was observed for MCD hallmark mutations in MYD88 (OR 1.9; p < 0.05), CD79B (OR 4.6; p < 0.001) and CDKN2A (OR 3.6; p < 0.0001) (Figure 1A). Cell density (τb = 0.30; p < 0.0001), mean nucleus roundness (τb = 0.15; p < 0.001), coefficient of variation of nucleus roundness (τb = −0.19; p < 0.0001) and size (τb = −0.29; p < 0.0001), and mean nuclear pixel intensity (τb = −0.36; p < 0.0001) were associated with positive model predictions (Figure 1A), corresponding to high-grade morphology.
WSIs labeled by hematopathologists as having high-grade cytomorphology also tended to have high model scores (OR 3.5; p = 0.07). Conclusions Our deep-learning model, trained using MYC-R status, identified MYC-driven HGBL by enriching for cases with MYC-R, MYC overexpression, molecular gene expression profiles associated with aberrant MYC activation and cell proliferation, and high-grade morphology. The model also identified molecular characteristics other than MYC-R, such as the MCD genetic subtype, that are associated with increased MYC expression (Schmitz et al. N Engl J Med 2018, Lacy et al. Blood 2020, Varano et al. Nature 2017) and that converge on a common phenotype and high-grade morphology (Figure 1B).
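As an aside on the diagnostic metrics reported in the Results, the short Python sketch below shows how sensitivity, specificity, positive predictive value (PPV) and negative predictive value (NPV) follow from confusion-matrix counts. Only the 21/22 correctly classified MYC-R (IGH) cases come from the abstract. The false-positive and true-negative counts are hypothetical values chosen for illustration, assuming that all 287 test WSIs had evaluable MYC-R (IGH) status, which the abstract does not state. The names `ConfusionCounts` and `diagnostic_metrics` are likewise illustrative and not part of the study's code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Counts of a binary classifier's outcomes against a reference label."""
    tp: int  # label-positive cases predicted positive
    fn: int  # label-positive cases predicted negative
    fp: int  # label-negative cases predicted positive
    tn: int  # label-negative cases predicted negative


def diagnostic_metrics(c: ConfusionCounts) -> dict:
    """Return the four standard diagnostic metrics as fractions."""
    return {
        "sensitivity": c.tp / (c.tp + c.fn),
        "specificity": c.tn / (c.tn + c.fp),
        "ppv": c.tp / (c.tp + c.fp),
        "npv": c.tn / (c.tn + c.fn),
    }


# tp and fn are taken from the abstract (21 of 22 MYC-R (IGH) cases).
# fp and tn are ASSUMED for illustration only: they are not reported and
# presume 287 - 22 = 265 evaluable MYC-R-negative WSIs.
counts = ConfusionCounts(tp=21, fn=1, fp=109, tn=156)
metrics = diagnostic_metrics(counts)

for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

Under these assumed counts the sketch illustrates a general point: when the target class is rare (here 22 positive cases), a classifier can combine high sensitivity and a high NPV with a low PPV, because false positives drawn from the much larger negative group outnumber the true positives.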