Abstract

Profound global loss of DNA methylation is a hallmark of many cancers. One potential consequence of this is the reactivation of transposable elements (TEs) which could stimulate the immune system via cell-intrinsic antiviral responses. Here, we develop REdiscoverTE, a computational method for quantifying genome-wide TE expression in RNA sequencing data. Using The Cancer Genome Atlas database, we observe increased expression of over 400 TE subfamilies, of which 262 appear to result from a proximal loss of DNA methylation. The most recurrent TEs are among the evolutionarily youngest in the genome, predominantly expressed from intergenic loci, and associated with antiviral or DNA damage responses. Treatment of glioblastoma cells with a demethylation agent results in both increased TE expression and de novo presentation of TE-derived peptides on MHC class I molecules. Therapeutic reactivation of tumor-specific TEs may synergize with immunotherapy by inducing inflammation and the display of potentially immunogenic neoantigens.

Highlights

  • Profound global loss of DNA methylation is a hallmark of many cancers

  • REdiscoverTE was devised to simultaneously quantify expression of all annotated genes defined in Gencode[13] and all RepeatMasker sequences in the human genome[14] (Fig. 1a, Supplementary Table 1, detailed in the Methods)

  • Post-quantification, we restricted our downstream analysis to transposable element (TE) sequences, which are classified into 1052 distinct TE subfamilies in five classes: long interspersed nuclear elements (LINE), short interspersed nuclear elements (SINE), long terminal repeats (LTR), SVA, and DNA transposons (Supplementary Fig. 1a)
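
The restriction to TE sequences described in the last highlight can be illustrated with a minimal sketch. It is not the REdiscoverTE implementation: it assumes that per-locus read counts are summed to subfamily level, and the function name `aggregate_te_subfamilies`, the locus IDs, and the toy counts are all hypothetical.

```python
from collections import defaultdict

# The five TE classes named in the text; other RepeatMasker categories
# (for example simple repeats) are dropped.
TE_CLASSES = {"LINE", "SINE", "LTR", "SVA", "DNA"}


def aggregate_te_subfamilies(locus_counts, annotation):
    """Sum per-locus counts into per-subfamily totals, keeping only TE classes.

    locus_counts: dict mapping a (hypothetical) locus ID to a read count.
    annotation:   dict mapping locus ID to a (subfamily, repeat_class) tuple.
    """
    totals = defaultdict(float)
    for locus, count in locus_counts.items():
        subfamily, repeat_class = annotation[locus]
        if repeat_class in TE_CLASSES:
            totals[subfamily] += count
    return dict(totals)


# Toy input, invented purely for illustration.
annotation = {
    "locus1": ("L1HS", "LINE"),
    "locus2": ("L1HS", "LINE"),
    "locus3": ("LTR12C", "LTR"),
    "locus4": ("(CA)n", "Simple_repeat"),
}
locus_counts = {"locus1": 10, "locus2": 5, "locus3": 7, "locus4": 100}

subfamily_counts = aggregate_te_subfamilies(locus_counts, annotation)
```

Here the simple-repeat locus is excluded, and the two L1HS loci are combined into a single subfamily total.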


Summary

Results

Across the two data sets, 10 cancer types showed a significantly higher proportion of reads mapping to TEs in tumor compared with matched normal tissues, suggesting active TE expression in these cancers; the reverse was observed in four cancer types (Supplementary Fig. 2a, d, e). We highlight 13 TE subfamilies from the LINE, LTR, and SVA classes that showed recurrent significant inverse correlation between expression and proximal DNA methylation across cancer types (Fig. 3h–j). We searched the matching MHC class I peptidome and whole proteome data for translational products of TEs by performing peptide identification and label-free quantification based on an augmented human proteome that included TE sequences from 51 overexpressed TE subfamilies. Using this approach, we identified 83 unique MHC-presented peptides derived from TEs and chose a subset of 39 peptides that were detected at least three times across all samples for further analysis. Expression of the two subfamilies with the largest number of detected peptides, SVA_D and LTR12C, was strongly associated with DNA damage repair and homologous recombination in the TCGA KIRC cohort (SVA_D: Spearman cor > 0.58, FDR < 4e-36; LTR12C: cor > 0.61, FDR < 2e-40).
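
The associations above are reported as correlations, with the Spearman coefficient named for SVA_D and LTR12C. As a minimal sketch of how such a rank correlation is computed, the following uses only the Python standard library. The per-sample values are invented for illustration, and applying the Spearman coefficient to the expression and methylation comparison is an assumption, since the text only states that the correlation is inverse.

```python
def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their rank positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j to cover the run of values tied with position i.
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: the Pearson correlation of the two rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical per-sample values for one TE subfamily (not from the study).
te_expression = [1.0, 2.5, 3.1, 4.8, 6.0]
proximal_methylation = [0.9, 0.7, 0.75, 0.3, 0.1]

rho = spearman(te_expression, proximal_methylation)
```

A negative `rho` corresponds to the inverse relationship described in the text: samples with lower proximal methylation tend to show higher TE expression. Multiple-testing correction, which underlies the FDR values quoted above, is omitted from this sketch.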

Peptides detected only in Aza
Methods
Code availability
