Abstract
Human long interspersed element-1 (L1) retrotransposons contain an internal RNA polymerase II promoter within their 5′ untranslated region (UTR) and encode two proteins (ORF1p and ORF2p) required for their mobilization (i.e., retrotransposition). The evolutionary success of L1 relies on the continuous retrotransposition of full-length L1 mRNAs. Previous studies identified functional splice donor (SD), splice acceptor (SA), and polyadenylation sequences in L1 mRNA and provided evidence that a small number of spliced L1 mRNAs retrotransposed in the human genome. Here, we demonstrate that the retrotransposition of intra-5′UTR or 5′UTR/ORF1 spliced L1 mRNAs leads to the generation of spliced integrated retrotransposed elements (SpIREs). We identified a new intra-5′UTR SpIRE that is ten times more abundant than previously identified SpIREs. Functional analyses demonstrated that both intra-5′UTR and 5′UTR/ORF1 SpIREs lack cis-acting transcription factor binding sites and exhibit reduced promoter activity. The 5′UTR/ORF1 SpIREs also produce nonfunctional ORF1p variants. Finally, we demonstrate that sequence changes within the L1 5′UTR over evolutionary time, which permitted L1 to evade the repressive effects of a host protein, can lead to the generation of new L1 splicing events, which, upon retrotransposition, generate a new SpIRE subfamily. We conclude that splicing inhibits L1 retrotransposition, that SpIREs generally represent evolutionary “dead-ends” in the L1 retrotransposition process, that mutations within the L1 5′UTR alter L1 splicing dynamics, and that retrotransposition of the resultant spliced transcripts can generate interindividual genomic variation.
Highlights
Long interspersed element-1 (L1) is a non-long terminal repeat retrotransposon that comprises approximately 17% of human genomic DNA [1]
The funder had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript
Long interspersed element-1 (L1) sequences comprise about 17% of the human genome reference sequence
Summary
Long interspersed element-1 (L1) is a non-long terminal repeat (non-LTR) retrotransposon that comprises approximately 17% of human genomic DNA [1]. Over 99.9% of human L1s cannot retrotranspose due to 5′ truncations, internal DNA rearrangements, or point mutations that inactivate the L1-encoded proteins [1,2,3,4]. The average diploid genome harbors approximately 80–100 full-length retrotransposition-competent L1s (RC-L1s) [5], including a small number of expressed [6,7,8], highly active (i.e., “hot”) L1s [5,9,10,11] that can retrotranspose efficiently in cultured cells or cancers. Human RC-L1s are approximately six kilobases (kb) in length [17,18]. They contain a 5′ untranslated region (UTR) that harbors both sense [19] and antisense [20] RNA polymerase II promoters (Fig 1A) as well as an antisense open reading frame (ORF0) [21], which encodes a protein that may mildly enhance L1 retrotransposition efficiency. L1s end with a 3′UTR, which contains a conserved polypurine motif, a “weak” RNA polymerase II polyadenylation signal, and a variable-length polyadenosine (poly(A)) tract (Fig 1A) [17,23,24,25].