Introduction: Clinical and pathological heterogeneity in aggressive B-cell non-Hodgkin lymphomas may be driven by several mechanisms of oncogenesis, such as translocations involving the immunoglobulin locus and proto-oncogenes, the ‘cell-of-origin’ from the germinal center (GC), and viral associations. Previous classifications of B-cell lymphomas have been defined by one or more of these characteristics and transforming events. However, transposable elements (TEs), which make up roughly half of the human genome and have key regulatory functions, have previously been overlooked both in the healthy GC and in B-cell lymphomas. We hypothesized that unique TE signatures, which may in part be driven by human endogenous retroviruses (HERVs), would allow a novel categorization of B-cell lymphomas. Here we present a comprehensive, locus-specific atlas of TE expression in healthy GC B cells, diffuse large B-cell lymphoma (DLBCL), Epstein-Barr virus (EBV)-positive and EBV-negative Burkitt lymphoma (BL), and follicular lymphoma (FL).

Methods: We obtained RNA-seq data from 529 DLBCLs belonging to the TCGA and NCICCR cohorts, 113 sporadic and endemic BL, and 12 FL from the CGCI cohort for gene and locus-specific TE quantification. RNA-seq data from sorted cells from the healthy GC were obtained from two recent studies (Holmes et al., 2020; Agirre et al., 2019). TEs were quantified with Telescope, which uses an expectation-maximization algorithm to address the computational challenges posed by repetitive, interspersed TE reads and low read counts. We used STAR for indexing and alignment, HTSeq for quantifying EBV viral reads in endemic BL samples, DESeq2 to identify differentially expressed TEs, and LRT, Boruta, and Lasso for feature selection.
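To illustrate why an expectation-maximization approach helps with reads that align equally well to several repetitive loci, the following is a minimal sketch of EM-based fractional read reassignment. It is a simplified toy model, not Telescope's actual implementation: the function `em_reassign`, the locus names `HERV_A` and `HERV_B`, and the uniform alignment weights are all hypothetical and chosen only for illustration.

```python
def em_reassign(reads, n_iter=100, tol=1e-9):
    """Toy EM for ambiguous reads (illustrative only, not Telescope's model).

    reads: non-empty list of dicts, each mapping a candidate locus to an
           alignment weight (here 1.0 for every candidate alignment).
    Returns (pi, counts): estimated relative abundance per locus and the
    corresponding expected read count per locus.
    """
    if not reads:
        raise ValueError("reads must be non-empty")
    loci = sorted({locus for read in reads for locus in read})
    # Start from uniform relative abundances.
    pi = {locus: 1.0 / len(loci) for locus in loci}
    for _ in range(n_iter):
        # E-step: split each read across its candidate loci in proportion
        # to current abundance times alignment weight.
        expected = dict.fromkeys(loci, 0.0)
        for read in reads:
            total = sum(pi[locus] * w for locus, w in read.items())
            for locus, w in read.items():
                expected[locus] += pi[locus] * w / total
        # M-step: abundances become the normalized expected counts.
        new_pi = {locus: expected[locus] / len(reads) for locus in loci}
        delta = max(abs(new_pi[locus] - pi[locus]) for locus in loci)
        pi = new_pi
        if delta < tol:
            break
    counts = {locus: pi[locus] * len(reads) for locus in loci}
    return pi, counts


# Hypothetical input: 3 reads unique to HERV_A, 1 unique to HERV_B,
# and 2 reads that align equally well to both loci.
example_reads = (
    [{"HERV_A": 1.0}] * 3
    + [{"HERV_B": 1.0}]
    + [{"HERV_A": 1.0, "HERV_B": 1.0}] * 2
)
pi, counts = em_reassign(example_reads)
print(counts)
```

In this toy input, the uniquely mapped reads pull the ambiguous reads toward the more abundant locus, so the two shared reads are split unevenly rather than discarded or divided in half. Real tools add further modeling (for example, alignment scores and priors) that this sketch omits.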
Results: TE-driven clustering of healthy B cells showed that HERVs could independently distinguish stages of B cell differentiation and specific GC subsets, with plasmablasts and bone marrow plasma cells having the highest number of differentially expressed HERVs. We found HERV loci of interest upstream of POU5F1B and MYC that can differentiate plasmablasts from other GC cells. Furthermore, we describe a map of locus-specific TE expression in DLBCL, BL, and FL. Strikingly, we find that BL can be subdivided into three HERV-driven clusters, which are not obtained with gene-only clustering, and identify four HERV loci sufficient to distinguish between clusters. Clusters are independent of EBV status, clinical variant, and location of the MYC translocation, and display a high number of differentially expressed LINE elements, lncRNAs, and snoRNAs.

Conclusions: Here, we report an atlas of TE expression in BL, DLBCL, and FL, and provide evidence for disease-specific changes in TE expression in B-cell lymphomas. TEs that are selectively expressed in lymphoma subtypes provide opportunities for novel subclassification that may have biological and pathological relevance.

The research was funded by: This work was supported in part by US National Institutes of Health (NIH) grants CA260691 (DFN) and UM1AI164559 (DFN). MLB is supported in part by the Department of Medicine, Fund for the Future program at Weill Cornell Medicine, sponsored by the Elsa Miller Foundation. JLM was supported in part by a Medical Scientist Training Program grant to the Weill Cornell–Rockefeller–Sloan Kettering Tri-Institutional MD-PhD Program (T32GM007739).

Keywords: Aggressive B-cell non-Hodgkin lymphoma, Bioinformatics, Computational and Systems Biology, Pathology and Classification of Lymphomas

No conflicts of interest pertinent to the abstract.