Background Diffuse large B cell lymphoma (DLBCL) is the most common aggressive lymphoid malignancy in adults. Although standard immunochemotherapy regimens achieve clinical remission and cure in a majority of patients, approximately 30% of patients have primary treatment-resistant disease or eventually relapse (rrDLBCL), and their prognosis is very poor. Recent molecular classification by large-scale genomic and transcriptomic profiling of newly diagnosed DLBCL tumors has deepened our understanding of disease drivers; however, the molecular heterogeneity of rrDLBCL has yet to be fully explored. Therefore, in this study, we used an unsupervised approach to better define transcriptional and genetic programs in rrDLBCL samples, hypothesizing that molecular stratification within this high-risk group may lead to a better understanding of this disease, which remains an unmet need, and to a more personalized therapeutic approach.
Methods Data used in this study included RNA (n=143) and whole exome (n=126) sequencing data from available FFPE tumor samples obtained at the time of relapse (any line of treatment; relapse timepoints r1-r10 were included in the analysis, one sample per patient). Samples came from patients consented to the Molecular Epidemiology Resource (n=61), from the Mayo Lymphoma Biobank (n=50), or from patients consented to the CC-122-ST-001 clinical trial (n=32, NCT01421524). Unsupervised clustering was performed on protein-coding gene expression values using non-negative matrix factorization (NMF), with 200 runs for each number of clusters from 2 to 6. Cell of origin (COO) was determined using the method described by Reddy et al. The tumor microenvironment (TME) was analyzed using CIBERSORTx and Lymphoma Microenvironment (LME) classification. Genetic classification was done using LymphGen.
Results To identify unique transcriptional programs in rrDLBCL, we used NMF and consensus clustering to define the optimal number of stable patient clusters.
We identified four rrDLBCL patient clusters (rrC1-rrC4) as the most stable solution, based on consensus clustering selection criteria applied to the iterative NMF runs. While relapse timepoint was not associated with individual clusters, we did observe significant differences in tumor characteristics between the clusters. COO classification found that rrC1 and rrC4 were enriched for GCB-DLBCL, while rrC3 was enriched for ABC-DLBCL (P < .001). To explore the biologic programs in each cluster, we used pathfindR, which leverages protein-protein interaction networks to identify disease-related pathways. rrC1 and rrC4 were defined by pathways involved in cell cycle and mismatch repair, rrC2 by oxidative phosphorylation, and rrC3 by MAPK and NF-κB signaling. Next, LME classification was used to define patterns in the tumor microenvironment, and we found a significant difference between groups (P < .001): rrC2 and rrC4 tumors were enriched for a depleted TME, while rrC1 and rrC3 had a more immune-rich or inflammatory TME. This was supported by CIBERSORTx analysis, which identified significant differences (P < .001) in the abundance of CD8 T cells, memory CD4 T cells, and M1 macrophages between clusters, with rrC4 having the lowest abundance. We then explored genetic patterns in each cluster. Gene-level mutation enrichment analysis showed BCL7A mutations exclusively in rrC4 (6/6, P = .03). The clusters were not associated with LymphGen classification. Lastly, our recently published high-risk signature classification, which is associated with early clinical failure and is characterized by metabolic dysregulation, a depleted TME, TP53 alterations, and poor outcome, differed significantly between the groups (P = .03), with enrichment for high-risk cases in rrC4. In summary, we show for the first time that rrDLBCL patients can be classified into four gene expression clusters that are associated with distinct pathway, TME, and genetic programs.
These clusters should now be tested to determine whether they can help select patients for newer rrDLBCL therapies such as CAR-T cell therapy and bispecific antibodies.
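The NMF consensus clustering step described in Methods (repeated NMF runs for each candidate number of clusters, followed by selection of the most stable solution) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not the authors' pipeline: the NMF uses plain multiplicative updates written in NumPy, the stability criterion is a simple dispersion score on the consensus matrix (the abstract does not name the criterion used), the data are a synthetic toy matrix, and the run count is reduced from 200 to 20 to keep the example fast. The function names `nmf`, `consensus_matrix`, `dispersion`, and `toy_expression` are hypothetical.

```python
import numpy as np

def nmf(V, k, rng, n_iter=200, eps=1e-9):
    """Factor a non-negative genes x samples matrix V into W (genes x k) and
    H (k x samples) using multiplicative updates for the Frobenius loss."""
    n_genes, n_samples = V.shape
    W = rng.random((n_genes, k)) + eps
    H = rng.random((k, n_samples)) + eps
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

def consensus_matrix(V, k, n_runs=20, seed=0):
    """Fraction of runs in which each pair of samples shares a cluster.
    Each sample is assigned to the factor with the largest coefficient in H."""
    rng = np.random.default_rng(seed)
    n_samples = V.shape[1]
    C = np.zeros((n_samples, n_samples))
    for _ in range(n_runs):
        _, H = nmf(V, k, rng)
        labels = H.argmax(axis=0)
        C += labels[:, None] == labels[None, :]
    return C / n_runs

def dispersion(C):
    """Stability score in [0, 1]; equals 1 when every consensus entry is
    exactly 0 or 1, i.e. all runs agree on every pair of samples."""
    return float(np.mean(4.0 * (C - 0.5) ** 2))

def toy_expression(n_genes=60, per_group=8, n_groups=3, seed=1):
    """Synthetic non-negative matrix with planted sample groups (toy data only)."""
    rng = np.random.default_rng(seed)
    V = rng.random((n_genes, per_group * n_groups)) * 0.1
    block = n_genes // n_groups
    for g in range(n_groups):
        V[g * block:(g + 1) * block, g * per_group:(g + 1) * per_group] += 1.0
    return V

# Scan the same range of cluster numbers as the abstract (2 to 6).
V = toy_expression()
scores = {k: dispersion(consensus_matrix(V, k)) for k in range(2, 7)}
best_k = max(scores, key=scores.get)
print(scores, best_k)
```

In practice, a real analysis would start from normalized protein-coding expression values and would likely rely on an established NMF implementation and a published stability criterion; the sketch only shows how repeated runs are aggregated into a consensus matrix and scored per candidate cluster number.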