Abstract

Background: The exonization of transposable elements (TEs) has proven to be a significant mechanism for the creation of novel exons. Existing knowledge of the retention patterns of TE exons in mRNAs was mainly established by the analysis of Expressed Sequence Tag (EST) data and microarray data.

Results: This study seeks to validate and extend previous studies on the expression of TE exons through an integrative statistical analysis of high-throughput RNA sequencing data. We collected 26 RNA-seq datasets spanning multiple tissues and cancer types. The exon-level digital expression (indicating the retention rate in mRNAs) was quantified by a doubly normalized measure called the rescaled RPKM (Reads Per Kilobase of exon model per Million mapped reads). We analyzed the distribution profiles and the variability (across samples and between tissue/disease groups) of TE exon expression, and compared them with those of constitutive and cassette exons. We inferred the effects of four genomic factors of a TE exon, namely its location, length, cognate TE family and TE nucleotide proportion (RTE, see Methods section), on the exon's expression level and expression variability. We also investigated the biological implications of a collection of highly expressed TE exons.

Conclusion: Our analysis confirmed prior studies in the following four respects. First, most TE exons in mRNAs, especially those without exact counterparts in the UCSC RefSeq (Reference Sequence) gene tables, show relatively high expression variability and low but still detectable expression levels in most tissue samples. Second, the TE exons in coding DNA sequences (CDSs) are less highly expressed than those in 3′ or 5′ untranslated regions (UTRs). Third, the exons derived from chronologically ancient repeat elements, such as MIRs, tend to be highly expressed in comparison with those derived from younger TEs. Fourth, the previously observed negative relationship between exon length and inclusion level in transcripts also holds for exonized TEs. Furthermore, our study yielded several novel findings: (1) for the TE exons with non-zero expression, and as shown in most of the studied biological samples, a high TE nucleotide proportion leads to lower retention rates in mRNAs; (2) the considered genomic features (i.e. a continuous variable such as the exon length or a category indicator such as 3′UTR) influence the expression level and the expression variability (coefficient of variation, CV) of TE exons in opposite directions; (3) not only the exons derived from Alu elements but also the exons derived from TEs of other families were preferentially established in zinc finger (ZNF) genes.
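As a rough illustration of the two quantities the abstract relies on, the sketch below computes a standard RPKM value for one exon and a coefficient of variation (CV) across samples. It is a minimal sketch under stated assumptions: it implements only the plain RPKM definition, not the authors' second rescaling step (which is defined in the Methods section and not reproduced here); the function names are hypothetical; and the use of the population standard deviation for the CV is an assumption rather than the paper's stated choice.

```python
from statistics import mean, pstdev


def rpkm(read_count, exon_length_bp, total_mapped_reads):
    """Standard RPKM: reads per kilobase of exon per million mapped reads.

    Hypothetical helper; this is NOT the paper's rescaled RPKM.
    """
    if exon_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("exon length and mapped-read total must be positive")
    # Dividing by (length / 1e3) and by (total / 1e6) gives the 1e9 factor.
    return read_count * 1e9 / (exon_length_bp * total_mapped_reads)


def coefficient_of_variation(values):
    """CV = standard deviation / mean (population SD is an assumption here)."""
    m = mean(values)
    if m == 0:
        return float("nan")  # CV is undefined when the mean expression is zero
    return pstdev(values) / m


# Illustrative numbers only: 50 reads on a 200 bp exon, 20 million mapped reads.
example_rpkm = rpkm(50, 200, 20_000_000)

# Illustrative expression values for one exon in two samples.
example_cv = coefficient_of_variation([5.0, 15.0])
```

Under these definitions, an exon with the same read count but twice the length receives half the RPKM, which is why a length normalization is needed before exons of different sizes can be compared.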

Highlights

  • The exonization of transposable elements (TEs) has proven to be a significant mechanism for the creation of novel exons

  • A recent study showed that Estrogen Receptor α (ERα), which is involved in human breast cancer, preferentially targets mammalian interspersed repeats (MIRs) transposons [16]

  • Expression of TE exons and the comparison with constitutive and cassette exons: By integrating the UCSC RefSeq annotation for human genome version 19 [37] and the human TE exon information collected in the TranspoGene database [21], we established four exon classes (C1-C4)

Summary

Introduction

The exonization of transposable elements (TEs) has proven to be a significant mechanism for the creation of novel exons. Because the microarray probe design for a gene is based on the exons included in one or multiple manually annotated or computationally predicted gene models [5], many infrequently used exons are never represented by any probe, and their expression levels cannot be measured. This problem can be circumvented by RNA-seq technology, which provides hypothesis-free, single-nucleotide resolution of gene expression so that, theoretically, any expressed sequence can be detected and quantified, given appropriate computational/statistical methods and sufficient sequencing depth. Analysis of single nucleotide polymorphisms (SNPs) revealed that exonization of TEs can be population-specific, implying that exonizations may enhance divergence and lead to speciation [20].

